Draft version August 31, 2010 

Preprint typeset using LaTeX style emulateapj v. 8/13/10



THE ARECIBO LEGACY FAST ALFA SURVEY: X. THE HI MASS FUNCTION AND $\Omega_{\rm HI}$ FROM THE 40% ALFALFA SURVEY

Ann M. Martin^1, Emmanouil Papastergis^1, Riccardo Giovanelli^{1,2}, Martha P. Haynes^{1,2}, Christopher M. Springob^3, Sabrina Stierwalt^4


ABSTRACT 

The Arecibo Legacy Fast ALFA (ALFALFA) survey has completed source extraction for 40% of
its total sky area, resulting in the largest sample of HI-selected galaxies to date. We measure the
HI mass function from a sample of 10,119 galaxies with $6.2 < \log(M_{\rm HI}/M_\odot) < 11.0$ and with
well-described mass errors that accurately reflect our knowledge of low-mass systems. We characterize
the survey sensitivity and its dependence on profile velocity width, the effect of large-scale structure,
and the impact of radio frequency interference in order to calculate the HIMF with both the $1/V_{\rm max}$
and 2DSWML methods. We also assess a flux-limited sample to test the robustness of the methods
applied to the full sample. These measurements are in excellent agreement with one another; the
derived Schechter function parameters are $\phi_*\,(h_{70}^{3}\,{\rm Mpc}^{-3}\,{\rm dex}^{-1}) = 4.8 \pm 0.3 \times 10^{-3}$,
$\log(M_*/M_\odot) + 2\log h_{70} = 9.96 \pm 0.02$ and $\alpha = -1.33 \pm 0.02$. We find
$\Omega_{\rm HI} = 4.3 \pm 0.3 \times 10^{-4}\,h_{70}^{-1}$, 16% larger than
the 2005 HIPASS result, and our Schechter function fit extrapolated to $\log(M_{\rm HI}/M_\odot) = 11.0$ predicts
an order of magnitude more galaxies than HIPASS. The larger values of $\Omega_{\rm HI}$ and of $M_*$ imply an
upward adjustment for estimates of the detection rate of future large-scale HI line surveys with, e.g.,
the Square Kilometer Array. A comparison with simulated galaxies from the Millennium Run and a
treatment of photoheating as a method of baryon removal from HI-selected halos indicates that the
disagreement between dark matter mass functions and baryonic mass functions may soon be resolved.
Subject headings: galaxies: distances and redshifts; — galaxies: dwarf — galaxies: luminosity function, 
mass function — radio lines: galaxies — surveys 



1. INTRODUCTION 

The disagreement between predictions of the number
of low-mass dark matter halos and the observations of
low-luminosity dwarf galaxies, commonly characterized
as the 'missing satellite problem,' is reflected in the
faint-end slopes of galaxy luminosity functions and neutral
hydrogen (HI) mass functions. Current dark matter
simulations and models (Boylan-Kolchin et al. 2009;
Jenkins et al. 2001) imply that the faint-end slope of the
underlying mass function is $\alpha \approx -1.8$, in agreement
with the Press-Schechter analysis of cosmic structure formation
(Press & Schechter 1974), but observational evidence
is consistent with a significantly shallower slope.

There is hope of resolving this discrepancy by inves- 
tigating physical effects on the observed baryons that 
would not influence the underlying dark matter distribu- 
tion. For example, photoheating by the UV background 
can deplete baryons from low mass halos, reducing the 
number of luminous galaxies observable today. There ap- 
pears to be a characteristic halo mass, below which severe 

1 Center for Radiophysics and Space Research,
Space Sciences Building, Cornell University, Ithaca,
NY 14853. e-mail: amartin@astro.cornell.edu,
papastergis@astro.cornell.edu, riccardo@astro.cornell.edu,
haynes@astro.cornell.edu

2 National Astronomy and Ionosphere Center, Cornell University,
Ithaca, NY 14853. The National Astronomy and
Ionosphere Center is operated by Cornell University under a
cooperative agreement with the National Science Foundation.

3 Australian Astronomical Observatory, P.O. Box 296, Epping,
NSW 1710, Australia. e-mail: springob@aao.gov.au

4 Spitzer Science Center, California Institute of Technology,
1200 E. California Blvd. 91125. e-mail:
sabrina@ipac.caltech.edu



baryon depletion could eliminate the abundance of dwarf
galaxies (Hoeft et al. 2008). Hoeft & Gottloeber (2010)
find this halo mass to be $\approx 6 \times 10^{9}\,h^{-1}\,M_\odot$, and that
it is robust against assumed UV background flux density
and simulation resolution effects. Other processes
related to star formation, such as supernova feedback
(Efstathiou 2000), can remove gas from galaxies, preferentially
removing baryons from those early galaxies residing
in weak potential wells. Understanding these baryonic
processes has the potential to resolve the missing
satellite problem (Simon & Geha 2007), but it remains
difficult to fully simulate baryons in forming and evolving
galaxies (Governato et al. 2007; Mayer et al. 2008;
Ceverino & Klypin 2009; Gnedin et al. 2009), and it is
therefore important to develop other observational constraints.

Since low-mass dark matter halos are the most likely
to suffer from baryon depletion, these effects may
cause the shallow faint-end slopes observed in luminosity,
circular velocity (Zwaan et al. 2009), and HI
mass functions (HIMFs). Detailed study of these influences
in the lowest-mass galaxies is only possible
very nearby, and the dwarf galaxies in the Local Group
have been shown to have great diversity in their star
formation histories and metallicities (Tolstoy et al. 2009;
Grebel & Gallagher 2004), with some galaxies losing gas
and ceasing star formation early while others have undergone
this process only recently. Recently, Ricotti
(2009) has suggested that these halos may be able to
re-accrete cold gas at late times, and proposes that the
gas-bearing ultrafaint dwarf Leo T (Irwin et al. 2007;
Ryan-Weber et al. 2008) may be an example of this process.
Such galaxies may then be observable in HI line surveys
like the ALFALFA survey (Giovanelli et al. 2010).

Blind HI surveys are ideal for probing these questions
surrounding the lowest-baryon systems. HI line surveys
are unbiased by properties like optical surface brightness,
and ALFALFA in particular is designed to detect systems
with lower HI masses than the blind surveys of the previous
generation, down to $\sim 3 \times 10^{7}\,M_\odot$ at the distance
of the Virgo cluster with SNR $\sim 6.5$ (Giovanelli et al.).
Since neutral gas fractions become large for dwarf
galaxies, dominating the stellar mass, HI surveys are
efficient at finding the extremely low-baryon-mass systems
locally (Schombert et al. 2001; Geha et al. 2006),
and the HIMF is a better measure of baryon content
at the lowest masses. Furthermore, environment is well
known to have an impact on gas reservoirs, with galaxies
in clusters tending to be HI deficient compared to those
in the field (Haynes et al. 1984). The results of this bias
as seen in the ALFALFA survey catalogs and in HI mass
functions of various environments may provide insights
into the relationship between HI gas densities, tidal and
ram pressure stripping, and star formation.

Surveys like ALFALFA which probe a cosmologically 
fair sample also provide a wealth of information on the 
rare galaxies at the highest masses. High-mass gas-rich 
galaxies constrain the cosmic density of neutral gas in 
the local universe, $\Omega_{\rm HI}$. HI contributes only about 1% of
the baryon budget at $z=0$ (Prochaska & Tumlinson 2009;
Fukugita & Peebles 2004; Fukugita et al. 1998). The HI
mass function is necessary to estimate this with great
precision in order to trace the evolution of the neutral
gas fraction, measured through damped Ly$\alpha$ systems at
higher redshifts. 

HI surveys also have the advantage of combining a 
galaxy detection, a redshift, and a mass estimate in a 
single observation without followup. This is particu- 
larly important given that about 70% of galaxies in the 
blind ALFALFA catalog are new HI detections and many 
are altogether new redshifts, indicating that the conventional
wisdom guiding targeted HI surveys toward galaxies
expected to contain large reservoirs was severely limited.
Finally, as simulations and semianalytic models
of warm and cold gas in evolving galaxies improve, the
HIMF can be used as a test of these results, as done in
Obreschkow et al. (2009) through a comparison of modeled
cold hydrogen gas in Millennium Run galaxies to the
Zwaan et al. (2005) mass function (see §6.3).

The first generation of blind HI surveys resulting in a
measurement of the local HIMF contained few galaxies:
Henning et al. (2000) detected 110 galaxies in the Southern
Zone of Avoidance, and the Arecibo Dual Beam Survey
(ADBS) HIMF was based on a sample of 265 galaxies
(Rosenberg & Schneider 2002). Both found a faint-end
slope $\alpha \approx -1.5$, significantly steeper than what is found
in other larger blind HI surveys. The published HIPASS
HIMFs were based on more galaxies than previous blind
surveys; the function extracted from the 1000 brightest
detections (Zwaan et al. 2003) had a faint-end slope of $-1.3$
and the later paper, with a fuller catalog of 4315 sources
(Zwaan et al. 2005), found $-1.37$. At the low-mass end
of the HIMF, there is clearly severe disagreement, and
previous data did not include enough low-mass objects
to robustly constrain masses $< 10^{8}\,M_\odot$. Springob et al.
(2005) investigated a complete sample of 2771 optically-selected
galaxies and found a shallow slope, $\alpha \approx -1.24$.
Improving the number of sources by, for example, increasing
the area of a shallow survey is not enough, on
its own, to resolve the issue; rather, increasing the volume
over which low-mass sources are detectable has the
largest impact. Distance uncertainties are largest nearby,
so a shallower survey will tend to base its low-mass slope
on more uncertain objects (Masters et al. 2004).

The ALFALFA survey catalogs, including those previously
published (Giovanelli et al. 2007; Saintonge et al.
2008; Kent et al. 2008; Stierwalt et al. 2009;
Martin et al. 2009) and those about to be published
(Haynes et al. 2010, in prep), now represent
$\sim 40\%$ of the final survey area, and the HI mass function
presented here considers a sample of $\sim 10000$ HI-selected
galaxies. In the following section, we will discuss the
ALFALFA dataset (§2). In §3.3 and §3.4, respectively,
we describe the $1/V_{\rm max}$ method of estimating the HIMF
from corrected galaxy counts, and the two-dimensional
Stepwise Maximum Likelihood (2DSWML) method.
Details of these methods are discussed in Appendices A
and B. After presenting the results of the global measurement
of the HIMF along with $\Omega_{\rm HI}$ in §4 and §5, we
will discuss the results as compared to the expectations
of dark matter simulations and those including cold gas,
addressing the divergence between HIMF slopes and
that predicted by the Press-Schechter formalism (§6).

2. ALFALFA DATASET 
2.1. The ALFALFA Survey 

The ongoing ALFALFA survey takes advantage of the
new multipixel ALFA receiver at the Arecibo Observatory.
When complete, the survey will have measured
$> 30,000$ galaxies in the 21 cm line out to $z \sim 0.06$ with
a median redshift of $\sim 8000$ km s$^{-1}$. The survey is more
sensitive than HIPASS, with a $5\sigma$ detection limit of 0.72
Jy km s$^{-1}$ for a source with profile width 200 km s$^{-1}$ in
ALFALFA compared to a $5\sigma$ sensitivity of 5.6 Jy km s$^{-1}$
for the same source in HIPASS (Giovanelli et al. 2005).
Narrow profile widths, down to $\sim 15$ km s$^{-1}$, allow us
to probe extremely small objects. ALFALFA detects objects
with neutral hydrogen masses $M_{\rm HI} \sim 3 \times 10^{7}\,M_\odot$
out to the distance of the Virgo cluster. In addition
to greater sensitivity, ALFALFA probes gas-rich galaxies
in the local universe with greater velocity resolution (11
km s$^{-1}$ after Hanning smoothing vs. 18 km s$^{-1}$) and a
deeper limiting redshift (18000 km s$^{-1}$ vs. 12700 km s$^{-1}$)
than HIPASS. Our significantly improved survey depth
for low-mass objects allows the ALFALFA survey to better
constrain the low-mass slope of the HI mass function.

ALFALFA survey data are acquired in a minimally- 
invasive drift scanning mode, in two passes ideally sepa- 
rated by several months, and individual 600 s drift scans 
are combined into three-dimensional data grids covering
2.4° in both R.A. and decl.; it therefore takes many
nights of observations to complete a grid from which ex- 
tragalactic sources can be extracted. 

Confidently detected sources are assigned one of three 
object codes, where Code 1 refers to a reliable extragalac- 
tic detection with a high S/N (> 6.5), Code 2 refers to ex- 
tragalactic sources with marginal S/N (4.5 < S/N < 6.5) 
confirmed by an optical counterpart with known optical 
redshift matching the HI measurement, and Code 9 refers 






to High Velocity Clouds (HVCs). For this analysis, we 
consider only objects designated Code 1, since we are 
interested in extragalactic objects with well-known se- 
lection criteria. Code 1 objects have a reliable S/N, a 
good match between the two polarizations that are inde- 
pendently observed by ALFALFA, a clean spectral profile 
and, in almost every case, a confident match with an optical
counterpart. The signal detection pipeline, discussed
at length in Saintonge (2007), combines a matched-filtering
technique for identifying source candidates with
an interactive process for source confirmation and pa- 
rameter measurement. This technique is estimated to 
result in a reliability of candidate detections ~ 95% for 
Code 1 objects, with a completeness better than 90% for 
the narrowest galaxies above the prescribed S/N thresh- 
old. The subsample of Code 1 objects provides a robust 
sample for the HIMF. 

2.2. Derived Parameters 

Published ALFALFA catalogs contain a set of measured
parameters (including coordinates, heliocentric velocity,
line profile velocity width $W_{50}$ measured at the
50% level of two profile peaks, integrated flux density
$S_{\rm int}$, S/N, and noise figure $\sigma_{\rm rms}$) in addition to a distance
estimate and a derived HI mass in solar units, obtained
from the expression $M_{\rm HI} = 2.356 \times 10^{5}\,D_{\rm Mpc}^{2}\,S_{\rm int}$.
Our distance estimates are subject to errors due to each
galaxy's unknown peculiar velocity, which translate into
mass errors. The fractional distance error due to peculiar
velocity decreases with increasing distance (the so-called
'Eddington effect'), so the lowest-mass galaxies which are
only found nearby are most prone to this error, our treatment
of which is discussed in detail in §3.2.
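
As a worked illustration of the mass expression above, the following minimal Python sketch evaluates $M_{\rm HI}$ from a distance and an integrated flux. The function names are hypothetical, and the first-order propagation rule (a fractional mass error of twice the fractional distance error, because $M_{\rm HI} \propto D^{2}$) is an illustrative assumption rather than the error treatment of §3.2.

```python
def hi_mass_msun(distance_mpc, s_int_jy_kms):
    """HI mass in solar units from distance (Mpc) and integrated flux (Jy km/s)."""
    return 2.356e5 * distance_mpc ** 2 * s_int_jy_kms


def fractional_mass_error_from_distance(frac_distance_error):
    """First-order propagation for M ~ D^2 (illustrative assumption)."""
    return 2.0 * frac_distance_error


if __name__ == "__main__":
    # Hypothetical source: 10 Mpc away with an integrated flux of 1 Jy km/s.
    print(hi_mass_msun(10.0, 1.0))
    # A 10% distance error maps to roughly a 20% mass error at first order.
    print(fractional_mass_error_from_distance(0.10))
```

Because the mass scales as the square of the distance, a nearby galaxy with a large fractional distance error can move by a substantial fraction of a dex in mass, which is why the faint end is the most exposed to this effect.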

2.3. Profile Width-Dependent Sensitivity

ALFALFA's ability to detect a signal depends not only
on the integrated flux, but also on the profile width $W_{50}$
(km s$^{-1}$). Fig. 1 displays the distribution of sources detected
by ALFALFA. Rather than a single flux limit, the
ALFALFA detection threshold is dependent on both $S_{\rm int}$
and profile width $W_{50}$, and we find that this relationship
changes above $W_{50} \approx 400$ km s$^{-1}$. We fit the $S_{\rm int,th}$ relationship
empirically to the data, rather than using the
assumed expression above. Due to differences in the two
methods we employ to calculate the HIMF, we consider
two different threshold cuts, described separately in §3.3
and §3.4.

2.4. The 40% ALFALFA Survey Sample 

ALFALFA catalogs have been extracted for a large contiguous
region in the southern Galactic hemisphere (i.e.
anti-Virgo direction) ($22^{\rm h} < \alpha < 03^{\rm h}$, $24^\circ < \delta < 32^\circ$),
and two regions in the northern Galactic hemisphere (i.e.
Virgo direction) ($07^{\rm h}30^{\rm m} < \alpha < 16^{\rm h}30^{\rm m}$, $4^\circ < \delta < 16^\circ$
and $24^\circ < \delta < 28^\circ$), with coverage totaling 2607 deg$^{2}$
or $\sim 40\%$ of the final ALFALFA volume. This includes
the previously published catalogs with a total of 2706
extragalactic source measurements (Martin et al. 2009;
Stierwalt et al. 2009; Kent et al. 2008; Saintonge et al.
2008; Giovanelli et al. 2007) in addition to an upcoming
large online data release (Haynes et al. 2010, in prep)^5.

5 This data release includes an additional strip of coverage, $22^{\rm h}$
$< \alpha < 03^{\rm h}$, $14^\circ < \delta < 16^\circ$, which is excluded here in favor of large



This primary dataset includes both Code 1 (n = 10452) 
and Code 2 (n = 2750) galaxies in addition to Code 9 
(n = 629) HVCs, where this figure includes measured 
subcomponents of larger cloud complexes. 

From the primary dataset, we have selected the 40%
ALFALFA Survey sample, hereafter α.40. This sample
has been selected to include only Code 1 objects, and the
total sample size is further reduced by the exclusion of
galaxies found beyond 15,000 km s$^{-1}$, where radio frequency
interference from FAA radar makes ALFALFA
blind to cosmic emission in a spherical shell $\sim 10$ Mpc
wide. The final α.40 sample contains 10,119 Code 1
galaxies, for a detection rate of 3.9 deg$^{-2}$ compared with
the HIPASS detection rate of $\sim 0.2$ deg$^{-2}$ (5317 extragalactic
sources over 29,000 deg$^{2}$; Meyer et al. (2004)
and Wong et al. (2006)). While rich in absolute number,
HIPASS does not extend deep enough in redshift to
sample a cosmologically fair volume.
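
The two detection rates quoted above follow directly from the source counts and sky areas given in the text. A short Python check, with a hypothetical helper name, is:

```python
def detection_rate_per_deg2(n_sources, area_deg2):
    """Mean number of detected sources per square degree."""
    return n_sources / area_deg2


# Counts and areas as quoted in the text.
alfalfa_rate = detection_rate_per_deg2(10119, 2607.0)
hipass_rate = detection_rate_per_deg2(5317, 29000.0)

print(f"ALFALFA: {alfalfa_rate:.1f} per deg^2, HIPASS: {hipass_rate:.1f} per deg^2")
```

The ratio of the two rates reflects the surface density of detections only; it says nothing by itself about the volume sampled, which is the point made in the last sentence of the paragraph above.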

In Figs. 2 and 3 we present the redshift distribution
of the 10,119 Code 1 objects in α.40 as a set of cone diagrams
by region in the survey. The two most obvious
features in Fig. 2 are the prominent void in the foreground
of the Pisces-Perseus supercluster, leading to the
dearth of detections out to about 3000 km s$^{-1}$, and the
portion of the main ridge of that supercluster that cuts
across the diagram. In the top panel of Fig. 3, the nearby
Virgo cluster is prominent, as is the Coma supercluster.
ALFALFA probes a wide variety of environments in the
local universe, and will soon study the overall properties
of HI-selected galaxies as a function of environment
(Saintonge et al. 2010, in prep).

Fig. 4 displays histograms of the statistical properties
of the α.40 sample. From (a) to (d), these histograms
represent the heliocentric velocity, velocity width $W_{50}$,
integrated flux $S_{\rm int}$, and S/N properties. In particular,
note that the S/N is high for all detections, since Code
2 objects have been excluded from this analysis. For
clarity, the histogram of the HI masses of galaxies in the
sample is plotted separately, in Fig. 5. On the low mass
end, where ALFALFA can place strong constraints on the
faint-end slope of the HIMF, the α.40 sample contains $\sim$
340 galaxies with $\log(M_{\rm HI}/M_\odot) < 8.0$ and $\sim 114$ with
$\log(M_{\rm HI}/M_\odot) < 7.5$; on the high mass end, which is best
probed by surveys with deep redshift limits, there are $\sim$
50 galaxies with $\log(M_{\rm HI}/M_\odot) > 10.5$.

The large sample size of ALFALFA, extending over a 
range of HI masses, is one of its key strengths in relation 
to the problem of characterizing the density of neutral 
gas in the present-day universe. With such a large num- 
ber of galaxies, we can approach our calculation of the 
HIMF in two distinct ways. First, using the entire sam- 
ple and a well-known characterization of our sensitivity, 
we can apply corrections and obtain the overall func- 
tion without excluding sources. Second, however, we can 
make stringent integrated flux cuts and use only those 
galaxies bright enough to be detectable irrespective of 
other properties (e.g. profile width). The sample contains
$\sim 3500$ galaxies with an integrated flux $> 1.8$ Jy km
s$^{-1}$, which provides a strict cut above which our objects
are detected regardless of profile width. This subsample 
size is comparable to the full sample size for previously- 
published HIMFs such as HIPASS, but samples a fair 

contiguous areas. 






cosmological volume. This subsample, referred to hereafter
as $\alpha.40_{1.8}$, provides a test case for analyzing the
quality of the HIMF measurement for the full α.40 sample.
The precise details of the calculation, of ALFALFA's
sensitivity, and of the corrections applied to the HIMF
calculated from α.40, make up the bulk of the following
sections and of Appendices A and B.

3. DETERMINATION OF THE HIMF 

3.1. The HI Mass Function 

The HI mass function, like galaxy luminosity functions, 
is usually parametrized as a Schechter function of the 
form 

$$\phi(M_{\rm HI}) = \frac{dn}{d\log M_{\rm HI}} = \ln 10 \; \phi_* \left(\frac{M_{\rm HI}}{M_*}\right)^{\alpha+1} e^{-M_{\rm HI}/M_*}$$

The parameters of interest are the faint-end slope $\alpha$, the
characteristic mass $\log M_*$, and the scaling factor $\phi_*$.
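
To make the parametrization concrete, the following minimal Python sketch evaluates the Schechter form per dex of HI mass. The function name and argument names are hypothetical; the default-free call in the usage example plugs in the parameter values quoted in the abstract purely as an illustration.

```python
import math


def schechter_himf(log_m_hi, phi_star, log_m_star, alpha):
    """Schechter HI mass function, in the units of phi_star per dex.

    log_m_hi and log_m_star are log10 of the masses in solar units.
    """
    x = 10.0 ** (log_m_hi - log_m_star)  # M_HI / M_*
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


if __name__ == "__main__":
    # Parameter values quoted in the abstract, used here only as an example.
    for log_m in (7.0, 8.0, 9.0, 10.0, 11.0):
        print(log_m, schechter_himf(log_m, 4.8e-3, 9.96, -1.33))
```

At masses well below $M_*$ the exponential term is close to unity and the function behaves as a power law of logarithmic slope $\alpha + 1$, which is why the faint-end slope is constrained by the lowest-mass bins.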

$\phi(M_{\rm HI})$ has historically been calculated in one of two
ways. The $\Sigma 1/V_{\rm max}$ method (Schmidt 1968) can be understood
by analogy to a purely volume-limited sample,
in which case the HIMF would be obtained by the galaxy
counts divided by the total volume of the survey. The
$\Sigma 1/V_{\rm max}$ method treats each individual galaxy in this
way, by weighting the galaxy counts by the maximum
volume $V_{{\rm max},i}$ within which a given source could have
been detected. This weighting strategy allows the inclusion
of low-mass galaxies, visible only in the nearby
Universe, in the same sample as rare high-mass galaxies,
found only in larger volumes. Additionally, the weights
may be adjusted in order to correct for a variety of selection
effects, large-scale structure effects, and missing
volume within the dataset, so that a well-characterized
survey can robustly measure the HIMF.

An alternative method, the Two Dimensional Step-Wise
Maximum Likelihood (2DSWML) approach, was
applied to the HIPASS measurements of the HIMF
(Zwaan et al. 2003, 2005). This method is designed to
make the calculation of the HIMF less sensitive to local
large scale structure, since shallow blind HI catalogs are
contaminated by the richness of the Local Supercluster.
If $1/V_{\rm max}$ is used without correction for this overdensity,
the resulting HIMF will overestimate the contribution
by low-mass galaxies and steepen the faint-end slope $\alpha$.
Stepwise maximum likelihood methods, by contrast, are
designed to reduce this bias, by assuming that the shape
of the HIMF is the same everywhere and then obtaining
the $\phi(M_{\rm HI})$ that maximizes the probability of the observed
distribution (Efstathiou et al. 1988). Given the
dependence of the ALFALFA survey's sensitivity on both
mass and profile width (§2.3), a Two-Dimensional Stepwise
Maximum Likelihood (2DSWML) approach is necessary
to calculate the HIMF for the full sample (Loveday
2000). 2DSWML maintains the main advantages of the
SWML method, which are its robustness against density
fluctuations in the survey volume and its model-independent
approach.

In this work, we apply both the $1/V_{\rm max}$ and the
2DSWML method for various reasons. Given our knowledge
of our sample's characteristics and sensitivity, the
$1/V_{\rm max}$ method is simple to apply and straightforward
to assess for potential bias. We can account for large
scale structure and other selection effects by applying
well-motivated corrections (discussed in §3.3). Perhaps
more significantly, this method also allows us to quantify
and understand those effects on the ALFALFA survey.
In particular, a goal of ALFALFA is to further probe the
differences between HI mass functions in different environments;
the 2DSWML assumption that the shape of
the function is the same throughout a sample may not
be valid. By contrast, the 2DSWML method is designed
to be more resistant to effects from large-scale structure,
and also results in a calculation of the selection function
which can be used in future analysis of the sample via,
for example, the two-point correlation function. A comparison
of the $1/V_{\rm max}$ and 2DSWML methods as applied
to α.40 is considered in §6.

In both the $1/V_{\rm max}$ and 2DSWML analyses, we have
used 5 mass bins per dex, and have found that the HIMF 
is not strongly affected by choice of bin size. In the case of 
2DSWML, we also bin by profile velocity width, and find 
no significant difference for bin sizes between 2 and 20 
bins per dex. The two main sources of error are counting 
statistics within the bins and mass errors. 

3.2. Errors on Distances and Masses 

Minimizing and taking into account distance errors is
key to robust estimation of luminosity and mass functions,
in particular at the faint end. Masters et al. (2004)
considered how strongly distance uncertainties will tend
to affect a given local volume survey's estimate of the
faint-end slope of the mass function. In that work, the
authors accounted for distance errors by constructing a
mock catalog, with masses assigned from a chosen HIMF
and with the spatial distribution determined from the
density field of the IRAS Point Source Catalog Redshift
survey (PSCz; Branchini et al. 1999). They concluded
that a survey toward the Virgo cluster, like a portion of
the sample considered here, will overestimate distances
to those galaxies if pure Hubble flow is used, since objects
in that field are falling into Virgo. Since the HI
mass depends on distance as $D^{2}$, this has serious consequences
for the faint-end slope of the HIMF. Therefore,
work in this region relies both on the development of
well-constrained local velocity models from primary and
secondary distance catalogs and on a careful consideration
of the effects of distance uncertainties. We consider
the Virgo cluster as a special case of this general problem
elsewhere in this paper.

These difficulties arise precisely because the lowest 
mass objects can be detected only at small distances, 
so that the fractional distance errors due to deviations 
from Hubble flow most strongly affect the most inter- 
esting bins of the mass function. The best distance es- 
timates, primary distances based on, e.g., Cepheids or 
the tip of the red giant branch, can only estimate distances
to within $\sim 10\%$ error, so beyond $cz \sim 6000$ km
s$^{-1}$ the uncertainties on distances obtained via a primary
method and those obtained assuming pure Hubble
flow become comparable, and the latter is typically used 
for simplicity. Within that distance, however, the dis- 
tance uncertainties can have a very strong influence, up 
to 100% in the case of the Virgo cluster. 

To minimize distance uncertainties, the ALFALFA survey
has adopted a distance estimation scheme that makes
use of a peculiar velocity flow model for the local Universe
(Masters 2005). This parametric multiattractor model,
based on the SFI++ catalog of galaxies with Tully-Fisher
distances (Springob et al. 2007), includes two attractors
(Virgo and a Great Attractor) along with a dipole and
quadrupole component. Distances to almost all α.40
galaxies within 6000 km s$^{-1}$ are estimated from the flow
model. Beyond $cz_{\rm CMB} = 6000$ km s$^{-1}$, the model is not
well-constrained, so distances are estimated from Hubble
flow ($H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$). Within 6000 km s$^{-1}$,
some galaxies have measured primary distances, which
are applied in our scheme, and other galaxies are known
to belong to a group, in which case the group's mean
velocity is used for distance estimation. The Masters
(2005) flow model also provides error estimates, constrained
by the fit of the model to the observed velocity
field and with a minimal error based on the local velocity
dispersion 163 km s$^{-1}$. When distances are estimated using
pure Hubble flow, the error is estimated to be $\sim 10\%$
via the assumption that peculiar velocities are $\sim$ a few
hundred km s$^{-1}$.

Mass errors for individual galaxies in our sample are
calculated from the measured error on the integrated flux
and an estimated error on the distance, which is the
larger of the local velocity dispersion 163 km s$^{-1}$, the distance
error estimate of the Masters (2005) flow model,
or 10% of the distance. Because the mass error shifts
galaxies into different bins of the HIMF, the relationship
between these errors and the final HIMF parameter errors
is complex. We deal with these errors by calculating
several hundred realizations of the HIMF after randomly
assigning flux and distance errors to each galaxy to find
the spread in each mass bin.
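
The resampling procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for α.40: the function names are hypothetical, the perturbations are drawn from Gaussian distributions, the velocity dispersion is converted to a distance error by dividing by $H_0$, and raw counts per bin stand in for the full HIMF calculation.

```python
import math
import random
import statistics

H0 = 70.0        # km/s/Mpc, Hubble constant quoted in the text
SIGMA_V = 163.0  # km/s, local velocity dispersion quoted in the text


def distance_error_mpc(distance_mpc, flow_model_error_mpc=0.0):
    """Largest of the three distance-error terms listed in the text.

    Dividing the velocity dispersion by H0 is an assumption of this sketch.
    """
    return max(SIGMA_V / H0, flow_model_error_mpc, 0.10 * distance_mpc)


def realization_counts(galaxies, rng, log_m_min=6.0, bins_per_dex=5, n_bins=25):
    """Counts per mass bin for one randomly perturbed copy of the sample.

    galaxies: list of (distance_mpc, distance_err_mpc, s_int, s_int_err).
    Gaussian perturbations are an assumption of this sketch.
    """
    counts = [0] * n_bins
    for d, d_err, s, s_err in galaxies:
        d_new = rng.gauss(d, d_err)
        s_new = rng.gauss(s, s_err)
        if d_new <= 0.0 or s_new <= 0.0:
            continue  # skip unphysical draws
        log_m = math.log10(2.356e5 * d_new ** 2 * s_new)
        k = math.floor((log_m - log_m_min) * bins_per_dex)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


def bin_spread(galaxies, n_realizations=300, seed=0, **bin_kwargs):
    """Standard deviation of the counts in each mass bin over all realizations."""
    rng = random.Random(seed)
    runs = [realization_counts(galaxies, rng, **bin_kwargs)
            for _ in range(n_realizations)]
    return [statistics.pstdev(column) for column in zip(*runs)]
```

Because each galaxy is re-binned in every realization, the resulting spread captures the migration of objects between neighboring mass bins, which a simple per-galaxy error bar would not.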

There is a complication on the high-mass end of the 
sample, as well. Arecibo's relatively large beam size at 
21 cm (~ 3.5 arcmin) can cause source confusion at large 
distances, where we also find our largest-mass objects. 
When this occurs, ALFALFA may be detecting more 
than one individual gas-rich galaxy as a single source, but 
in cases of interaction it's also possible that the galaxies 
involved are part of a single, large HI envelope. While 
higher-resolution followup would be required to fully re- 
solve this issue, we have investigated optical images and 
redshift catalogs for the highest mass ($\log(M_{\rm HI}/M_\odot) > 10.5$)
ALFALFA detections, and have found that the majority 
of these objects are not likely to be blends of HI emis- 
sion from an interacting system and some others are close 
pairs that are likely to share a single gas envelope. 

3.3. $1/V_{\rm max}$ Method

For each galaxy in α.40, $V_{{\rm max},i}$ is calculated based on
that galaxy's HI mass $M_i$, the minimum integrated flux
$S_{{\rm min},i}$ at which such a galaxy is detected in ALFALFA,
and finally the distance $D_{{\rm max},i}$ corresponding to that
limit. The calculated $V_{{\rm max},i}$, corresponding to the effective
search volume for that galaxy, excludes volume that
is not covered by ALFALFA, including volumes where
detection ability, and therefore effective search volume,
is reduced by the appearance of radio frequency interference
at the corresponding frequency. Galaxies are binned
by mass and $\phi(M_{\rm HI})$ is calculated by summing the reciprocals
of $V_{\rm max}$.
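
The basic, uncorrected form of this estimator can be sketched in Python as below. The function names are hypothetical, and the sketch makes simplifying assumptions that the text does not: a Euclidean cone volume over the quoted 2607 deg$^{2}$, a single limiting flux supplied per galaxy, and none of the corrections for missing volume, width-dependent sensitivity, or large-scale structure.

```python
import math

# 2607 deg^2 (the sky coverage quoted in the text) converted to steradians.
SURVEY_AREA_SR = 2607.0 * (math.pi / 180.0) ** 2


def d_max_mpc(log_m_hi, s_min_jy_kms):
    """Distance at which a galaxy of this HI mass falls to its limiting flux."""
    return math.sqrt(10.0 ** log_m_hi / (2.356e5 * s_min_jy_kms))


def v_max_mpc3(log_m_hi, s_min_jy_kms, d_limit_mpc, solid_angle_sr=SURVEY_AREA_SR):
    """Euclidean cone volume out to the smaller of D_max and the survey limit."""
    d = min(d_max_mpc(log_m_hi, s_min_jy_kms), d_limit_mpc)
    return solid_angle_sr / 3.0 * d ** 3


def himf_one_over_vmax(log_masses, v_max_values,
                       log_m_min=6.0, bins_per_dex=5, n_bins=25):
    """Sum of 1/V_max in each mass bin, returned in Mpc^-3 dex^-1."""
    phi = [0.0] * n_bins
    for log_m, v_max in zip(log_masses, v_max_values):
        k = math.floor((log_m - log_m_min) * bins_per_dex)
        if 0 <= k < n_bins:
            phi[k] += 1.0 / v_max
    # Dividing by the bin width (1 / bins_per_dex dex) gives a density per dex.
    return [p * bins_per_dex for p in phi]
```

In this form each correction described in the text would enter as a modification of the per-galaxy `v_max` value before the sum is taken, which is what makes the method easy to inspect for bias.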

By weighting the count for each galaxy, the $1/V_{\rm max}$
method can be corrected for a variety of known systematic
effects. The major corrections applied to the HIMF
for this sample address (1) missing volume, (2) the profile
width-dependent sensitivity of the survey, and (3) the
known large scale structure in the local volume.

Sources of radio frequency interference contaminate
the signal in regions of frequency space corresponding
to spherical shells in the survey volume. This effectively
reduces the search volume of the overall survey. Fig.
6 shows the average relative weight, compared to 100%
coverage, within the α.40 survey volume as a function of
velocity. The large dip between 15000 and 16000 km s$^{-1}$
is due to the FAA radar at the San Juan airport, and
because of this extreme loss of volume at large distances
we restrict the α.40 sample to only those galaxies within
15000 km s$^{-1}$. Given our knowledge that the relative
weight is less than 1.0 at specific distances, the $V_{\rm max}$
value calculated for a specific galaxy is reduced to reflect
the loss of effective search volume. This correction
is not significant for the lowest-mass galaxies, but more
generally, the correction is very small. The effect on the
final Schechter parameters for the HIMF is on the order
of 2%.
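
One way to picture this reduction of $V_{\rm max}$ is to split the survey cone into distance shells and scale each shell's volume by its relative weight. The Python sketch below does this under stated assumptions: the function name, the shell edges, and the weights are all hypothetical, the geometry is Euclidean, and the weights are taken to be tabulated in distance rather than velocity.

```python
def effective_volume_mpc3(d_max_mpc, solid_angle_sr, shell_edges_mpc, weights):
    """Search volume out to d_max with each shell scaled by its relative weight.

    shell_edges_mpc: increasing distances bounding the shells (len = n + 1).
    weights: relative coverage of each shell, 1.0 meaning no loss (len = n).
    """
    volume = 0.0
    for inner, outer, w in zip(shell_edges_mpc[:-1], shell_edges_mpc[1:], weights):
        outer = min(outer, d_max_mpc)  # truncate the last shell at d_max
        if outer <= inner:
            break  # all remaining shells lie beyond d_max
        volume += w * solid_angle_sr / 3.0 * (outer ** 3 - inner ** 3)
    return volume


if __name__ == "__main__":
    # Hypothetical two-shell example: the outer shell has lost half its coverage.
    print(effective_volume_mpc3(20.0, 3.0, [0.0, 10.0, 20.0], [1.0, 0.5]))
```

A galaxy whose $D_{\rm max}$ lies inside the contaminated shell loses little or no volume, consistent with the statement above that the correction is not significant for the lowest-mass galaxies.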

As discussed in §2.3, ALFALFA's detection ability is
dependent on the profile velocity width of the signal,
$W_{50}$, in km s$^{-1}$, rather than strictly on the integrated
flux of the signal. To obtain an expression for this detection
limit, we used the data itself, as displayed in
Fig. 1. The dependence of ALFALFA sensitivity on
both flux and profile width, described in §2.3, has the
further complication of affecting the survey's completeness,
and this must be accounted for in order to extract
the underlying HIMF. The distribution in Fig. 1 indicates
that ALFALFA finds many galaxies with low fluxes
and narrow widths, but there is a deficiency of galaxies
with low fluxes and large widths. Because we have
no knowledge of the true distribution below ALFALFA's
detection capability, we have developed a completeness
correction that takes advantage only of the data, making
no assumptions about the potentially intrinsically
small unobserved population. The profile width completeness
correction most strongly affects galaxies with
$\sim 9.0 < \log(M_{\rm HI}/M_\odot) < 10.0$, and has a very small
influence ($< 2\%$) on both the faint-end slope $\alpha$, since
low-mass (i.e. narrow velocity width) galaxies aren't affected,
and $\log(M_*)$, since the counts in the high-mass
bins are large enough to robustly constrain this. This is
essentially a galaxy counting correction, so its primary
influence is on $\phi_*$, increasing that parameter by
20%. The full details of this completeness correction
are described in Appendix A. The validity of this
completeness correction, which we have applied to the
full sample, is tested in §4.1 and §5.1 by calculating the
HIMF using an integrated flux cut, which allows us to
neglect the biased sensitivity dependence on width. By
comparing the resulting HIMF in both cases, we assess
the impact of this correction.

The most significant bias in the 1/V_max calculation of the HIMF is
that due to the large-scale structure of the galaxy distribution. Blind
HI surveys tend to be relatively shallow and are thus biased by the
overdensity of the local volume, which particularly affects the
lowest-mass HI-rich galaxies that are only found nearby. If a
correction for large-scale structure is not applied, we overestimate
the contribution of low-mass galaxies to the overall HIMF and thereby
artificially steepen the faint-end slope α. We discuss this correction
in Appendix A. The
large-scale structure volume correction has only a very weak effect on
log(M_*), but the effects on α (~10%) and φ_* (~30%) are large. Since
this correction is so significant, it is sensitive to the details of
the density reconstruction used. Agreement between the 1/V_max and
2DSWML results provides the best indication of the quality of this
correction.

However, large-scale structure introduces the further bias of
selectively reducing counts in mass bins that are primarily detectable
in void volumes, and the weighting scheme correction cannot account for
that. In particular, the voids in the Pisces-Perseus region between
3000 and 8000 km s^-1, visible in Fig. 5, bias that portion of the α.40
sample against galaxies with 8.5 < log(M_HI/M_⊙) < 9.0, leading to a
systematic undercounting in those bins. Because the 1/V_max method is
sensitive to large-scale structure, this undercounting introduces a
spurious 'bump' feature into the HIMF, described in detail in §4.1.
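
For orientation, the binned 1/V_max estimator that these corrections feed into can be sketched as follows. This is a minimal sketch under simplifying assumptions: each galaxy is assumed to arrive with an already corrected V_max (RFI weight, width-dependent sensitivity, and density weighting applied upstream), and the function and argument names are hypothetical.

```python
def himf_one_over_vmax(log_masses, v_max, bin_edges):
    """Binned 1/V_max mass function in Mpc^-3 dex^-1.

    log_masses: log10 HI mass of each galaxy
    v_max:      effective maximum volume (Mpc^3) of each galaxy
    bin_edges:  increasing log-mass bin edges
    Returns (phi, err), where err is the counting error
    sqrt(sum of squared weights) in each bin.
    """
    n_bins = len(bin_edges) - 1
    phi = [0.0] * n_bins
    var = [0.0] * n_bins
    for m, v in zip(log_masses, v_max):
        for j in range(n_bins):
            if bin_edges[j] <= m < bin_edges[j + 1]:
                width = bin_edges[j + 1] - bin_edges[j]
                weight = 1.0 / (v * width)
                phi[j] += weight
                var[j] += weight ** 2
                break
    err = [x ** 0.5 for x in var]
    return phi, err


phi, err = himf_one_over_vmax(
    [8.1, 8.2, 9.5], [10.0, 10.0, 1000.0], [8.0, 9.0, 10.0]
)
```

Because each galaxy enters only through its own V_max, a void at the distance where a given mass bin is best detected lowers that bin directly, which is the origin of the spurious feature discussed in the text.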

3.4. 2DSWML Method 



As discussed in §3.1 and §3.3, the main disadvantage of the 1/V_max
method is its potential sensitivity to large-scale structure. If
large-scale structure corrections were not adopted, the density of low
HI-mass galaxies would be systematically overestimated, since most of
these galaxies are detectable only in the very local, substantially
overdense universe, including the Virgo Cluster and the Local
Supercluster. This would bias the low-mass slope α of the Schechter fit
to the HIMF, weakening one of the major strengths of the ALFALFA
dataset: its ability to probe, for the first time, the population of
extremely low HI-mass galaxies over a wide solid angle.

The original SWML method is applicable to samples selected by
integrated flux. It assumes that the observed galaxy sample is drawn
from a common HI mass function throughout the survey volume, denoted by
φ(M_HI). Unlike most maximum likelihood methods, which assume a
functional form for φ(M_HI) (Sandage et al. 1979), SWML splits the
distribution into bins of m = log(M_HI/M_⊙) and assumes a constant
distribution within each logarithmic bin. In this way, the value of the
distribution in each bin becomes a parameter, φ_j (j = 1, 2, ..., N_m),
which is adjusted in order to maximize the joint likelihood of
detecting all galaxies in the sample, hence yielding a maximum
likelihood estimate of the mass distribution. Since the values of the
parameters are free to vary independently, the procedure is completely
general and does not assume any functional form for the distribution a
priori.
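
The stepwise idea can be illustrated for the one-dimensional, flux-limited case with the iterative solution commonly used for this estimator (Efstathiou et al. 1988). The code is a minimal sketch, not the implementation used for α.40: it assumes each galaxy's limiting log mass at its distance is already known, all names are hypothetical, and it returns only the shape of the distribution, normalized to integrate to one.

```python
def ramp(x, width):
    """Fraction of a bin lying above a limit: 0 below -width/2,
    1 above +width/2, and linear in between."""
    if x <= -0.5 * width:
        return 0.0
    if x >= 0.5 * width:
        return 1.0
    return x / width + 0.5


def swml(log_masses, log_mass_limits, bin_centers, width,
         n_iter=200, tol=1e-10):
    """Stepwise maximum likelihood estimate of the mass distribution shape."""
    n_bins = len(bin_centers)
    counts = [0] * n_bins
    for m in log_masses:
        for k, c in enumerate(bin_centers):
            if abs(m - c) <= 0.5 * width:
                counts[k] += 1
                break
    # H[i][k]: fraction of bin k above the limiting mass of galaxy i
    H = [[ramp(c - lim, width) for c in bin_centers]
         for lim in log_mass_limits]
    phi = [1.0 / (n_bins * width)] * n_bins  # flat starting guess
    for _ in range(n_iter):
        # probability mass visible to each galaxy under the current phi
        visible = [sum(p * width * h for p, h in zip(phi, row)) for row in H]
        new = []
        for k in range(n_bins):
            d = sum(row[k] / v for row, v in zip(H, visible) if v > 0)
            new.append(counts[k] / (width * d) if d > 0 else 0.0)
        norm = sum(p * width for p in new)
        new = [p / norm for p in new]
        change = max(abs(a - b) for a, b in zip(new, phi))
        phi = new
        if change < tol:
            break
    return phi


# With every limit far below the lowest bin the estimate reduces to
# the normalized counts.
shape = swml([7.0, 7.1, 8.0, 8.1, 8.2, 9.0], [0.0] * 6,
             [7.0, 8.0, 9.0], 1.0)
```

A galaxy that could only have been detected in the most massive bin adds no information about the lower bins, which is why the estimate does not depend on how the galaxies are distributed in space.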

When the sample is not integrated-flux-limited and the selection
function depends on additional observables, the SWML technique has to
be extended to take into account the underlying galaxy distribution in
all the physical properties that enter the calculation of the selection
function. In the case of α.40, the limiting integrated flux depends on
the galaxy profile width W_50, and thus the method needs to consider
the joint two-dimensional distribution of galaxies in both HI mass and
observed velocity width, φ(M_HI, W_50). 2DSWML relies on the assumption
that the sample is statistically complete. Since ALFALFA's sensitivity
to a source depends on both its integrated flux and its velocity width
W_50 (§2.3), we fit a strict completeness threshold to the observed
relationship seen in Fig. 1 and exclude galaxies falling below this
completeness cut.

The details of the 2DSWML method and its application to α.40 are given
in Appendix B.

3.5. HIMF Error Analysis 

The simplest source of error in the estimate of the HIMF is Poisson
counting error in the bins, which is added to the other sources of
error considered next. The relationship between errors on corrections
applied to individual galaxies and errors on the final HIMF points and
measured parameters is complex. Mass errors, for example, may shift
galaxies in the sample from one bin to another, as discussed in §3.2,
so it is not possible to calculate the error on a particular bin
analytically. In order to treat these errors appropriately, we create
> 250 realizations of the HIMF for each of the results shown in §4 and
§5. The error on S_int is measured in the ALFALFA source extraction
pipeline, and we have estimated errors on the distance for each galaxy
in the sample. Each of these contributes to the mass error, and we
apply a Gaussian random error to each galaxy's mass in each
realization. The spread in the bin values across the ensemble of
realizations contributes to the overall error in each point. We treat
errors due to uncertain parameter estimation in the relationship
between log(M_HI/M_⊙) and the Gumbel distribution parameters μ and β in
the same way. The resulting HIMF therefore takes the known sources of
error into consideration.
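
The realization procedure can be sketched as below. This is a simplified illustration, not the procedure as implemented for α.40: it perturbs log masses with a Gaussian of given width and measures the scatter of raw bin counts, whereas the text applies the full HIMF calculation to each realization. The function name, the seed, and the default of 250 realizations are assumptions for the example.

```python
import random
import statistics


def himf_scatter(log_masses, sigma_log_mass, bin_edges, n_real=250, seed=0):
    """Scatter of bin counts over Monte Carlo realizations of the masses.

    Each realization draws a new log mass for every galaxy from a Gaussian
    centred on the measured value, then re-bins the sample. Returns the
    standard deviation of the counts in each bin across realizations.
    """
    rng = random.Random(seed)
    n_bins = len(bin_edges) - 1
    per_bin = [[] for _ in range(n_bins)]
    for _ in range(n_real):
        counts = [0] * n_bins
        for m, s in zip(log_masses, sigma_log_mass):
            m_new = rng.gauss(m, s)
            for j in range(n_bins):
                if bin_edges[j] <= m_new < bin_edges[j + 1]:
                    counts[j] += 1
                    break
        for j in range(n_bins):
            per_bin[j].append(counts[j])
    return [statistics.pstdev(c) for c in per_bin]


# Galaxies close to a bin edge move between bins from one realization
# to the next, so both neighbouring bins acquire scatter.
spread = himf_scatter([8.99] * 50, [0.1] * 50, [8.0, 9.0, 10.0],
                      n_real=50, seed=1)
```

The resulting per-bin scatter would then be combined with the Poisson term and the other error sources described in this section.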

Sources of systematic bias remain, particularly for the 1/V_max
measurement, which is sensitive to the large-scale structure in the
galaxy distribution. The effects of large-scale structure and of cosmic
variance will be reduced as the survey continues, increasing its volume
and coverage of varied cosmological environments.

In order to account for errors that are more difficult to quantify, we
follow the example of Zwaan et al. (2005) and jackknife resample 21
equal-area regions. The resampling technique helps account for residual
large-scale structure beyond that which we have corrected, and also for
any systematic survey effects that change spatially across the sky or
temporally throughout the survey's observations.
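
A delete-one jackknife error can be computed as in the sketch below. The formula is the standard delete-one jackknife variance; that it matches the exact estimator used here and by Zwaan et al. (2005) is an assumption, and the function name is hypothetical. Each input value would be a quantity (a bin value or a fit parameter) re-measured with one of the 21 regions removed.

```python
import math


def jackknife_error(leave_one_out_values):
    """Delete-one jackknife standard error.

    leave_one_out_values: the statistic recomputed N times, each time
    with one of the N sky regions removed.
    """
    n = len(leave_one_out_values)
    mean = sum(leave_one_out_values) / n
    variance = (n - 1) / n * sum((x - mean) ** 2
                                 for x in leave_one_out_values)
    return math.sqrt(variance)


sigma = jackknife_error([1.0, 2.0, 3.0])
```

The factor (N - 1)/N accounts for the strong correlation between leave-one-out samples, each of which shares all but one region with every other.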

3.5.1. 2DSWML Error Estimates 

The 2DSWML approach introduces another source of error. We assign
errors on the parameters φ_jk, introduced in §3.4, via the inverse of
the information matrix, following Loveday (2000) and Efstathiou et al.
(1988). The general form of the information matrix for a likelihood
function ℒ that depends on a set of parameters θ is given by



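$$
I(\theta) = -\begin{bmatrix}
\dfrac{\partial^2 \ln \mathcal{L}}{\partial\theta_m \, \partial\theta_n}
  + \dfrac{\partial g}{\partial\theta_m} \dfrac{\partial g}{\partial\theta_n}
  & \dfrac{\partial g}{\partial\theta_n} \\[1ex]
\dfrac{\partial g}{\partial\theta_m} & 0
\end{bmatrix}_{\theta = \hat{\theta}}
\qquad (2)
$$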



where g is a constraint of the form g(θ) = 0. We choose to apply the
constraint g = [...] Δm Δw - 1 = 0, with β_1 = β_2 = 1 and reference
values for the HI mass and W_50 equal to the α.40 sample mean. The
result is an error estimate for the parameters φ_jk, i.e. the value of
the HIMF in each mass bin, and is added in quadrature to the other
sources of error described above.

4. 1/V_max METHOD: RESULTS
4.1. Global HI Mass Function and Ω_HI

The global HI mass function derived from the α.40 sample via the
1/V_max method is presented in the top panel of Fig. Overplotted error
bars have been derived as described above; mass errors due to errors on
flux and distance estimates are reflected in the errors on the HIMF
points, rather than on the mass-axis bin positions, since these errors
change the bin counts.

The best-fit Schechter function describing this HIMF is overplotted as
a dashed line. The derived parameters are φ_* (h_70^3 Mpc^-3 dex^-1) =
(6.0 ± 0.3) × 10^-3, log(M_*/M_⊙) + 2 log h_70 = 9.91 ± 0.01 and
α = -1.25 ± 0.02. However, the large-scale structure in the ALFALFA
survey regions has introduced a 'bump' into this measurement of the
HIMF. The feature visible in Fig. 7 at log(M_HI/M_⊙) ~ 9.0 does not
appear to be intrinsic to the HI-rich galaxy population. Previous work
on luminous galaxies has nonetheless suggested that the shape of
luminosity and mass functions may be more complex than single Schechter
functions. Luminosity functions in clusters, such as Coma and Fornax,
are inconsistent with single values of α; Trentham (1998) has
recommended a 'composite' luminosity function that steepens for both
bright and faint objects and flattens out in between, which produces a
'dip' feature. Single Schechter functions provided a poor fit to 2dFGRS
luminosity functions (Madgwick et al. 2002), and results from the Sloan
Digital Sky Survey also suggest that a second (Baldry et al. 2004) or
third (Li & White 2009) Schechter function component best describes the
underlying population of galaxies at low redshift. While, given these
findings, it is possible that the feature in Fig. 7 reflects a complex
shape in the HIMF, it is more likely that the feature is spurious, as
we discuss below.

Such features occur because the 1/V_max method is sensitive to
large-scale structure. Because the survey's HI mass sensitivity varies
with distance (i.e., α.40 is not a volume-limited sample), each mass
bin in the HIMF corresponds to some preferred distance at which ALFALFA
is most sensitive to galaxies in that mass bin. Extended large-scale
structures can therefore change the shape of the HIMF in bins
corresponding to the distance of those features. Because of the large
sample size of α.40, it is possible to separately investigate the three
survey regions represented by the cone diagrams in Figs. 2 and 3 and to
isolate the structures that contribute to such features. Specifically,
the 'bump' feature in Fig. 7 is due to a lack of sources in the
foreground of the Great Wall and an overabundance within the Great
Wall, clearly evident in Fig. 3. The large-scale structure correction
(§3.3 and Appendix A) reduces this feature but cannot totally eliminate
it, in part because the density maps used to correct for large-scale
structure are smoothed on scales of a few Mpc and can underestimate
extremes in the density contrast. Features such as this one will be
reduced as the ALFALFA survey continues and the sample grows. The
2DSWML method is not sensitive to large-scale structure and does not
produce this feature (§5).



This feature appears significant in part because our statistical errors
on the HIMF points are so small, but it leads to a poor fit and an
underestimate of the faint-end slope α. This is clear in the top panel
of Fig. 8, which displays the residual between the 1/V_max HIMF points
and the derived best-fit Schechter function and shows that the
Schechter function systematically over- and underestimates the HI mass
function because of this feature.

While this feature is well understood, it has the undesirable effect of
artificially reducing the faint-end slope α. In an effort to reduce the
effect of this spurious feature and to better fit the points, we fit
the sum of a Schechter function and a Gaussian; the Gaussian component
serves to filter out the feature, leading to a better estimate of α.
The results are shown as the solid line in Fig. 7, with the residuals
shown in the bottom panel of Fig. 8. This fit significantly improves
the reduced χ², and the residuals are small and, near
log(M_HI/M_⊙) ~ 9.0, more randomly scattered than in the top panel of
Fig. 8. However, there is larger uncertainty in the parameters in this
case, since each function is constrained by fewer points. The Schechter
function parameters, displayed in Table 1, are
log(M_*/M_⊙) + 2 log h_70 = 9.95 ± 0.04 and α = -1.33 ± 0.03. The
Schechter function measurement of φ_* (h_70^3 Mpc^-3 dex^-1) =
(3.7 ± 0.6) × 10^-3, however, has been affected by the addition of the
second component to the fit, and we therefore defer to the 2DSWML
measurement of that parameter.
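
Written out, the two-component model described above takes the following form. This is a sketch under stated assumptions: it uses the conventional per-dex form of the Schechter function, and the symbols A, μ_G and σ_G for the Gaussian peak height, mean and spread are introduced here for illustration only.

```latex
\phi(M_{\rm HI}) \equiv \frac{dn}{d\log_{10} M_{\rm HI}}
  = \ln(10)\,\phi_*
    \left(\frac{M_{\rm HI}}{M_*}\right)^{\alpha+1}
    \exp\!\left(-\frac{M_{\rm HI}}{M_*}\right)
  + A \exp\!\left[-\frac{\left(\log_{10} M_{\rm HI} - \mu_G\right)^2}
                        {2\sigma_G^2}\right]
```

Because the Gaussian term absorbs counts near the feature, the Schechter normalization φ_* is no longer constrained by those bins, which is consistent with the decision to defer to the 2DSWML value of that parameter.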

The Gaussian parameters are not included in Table 1, since they are
used to filter out the 'bump' feature and are not expected to have
physical meaning. The best-fit Gaussian has a peak height (h_70^3
Mpc^-3) of (5 ± 1) × 10^-3, a mean log(M_HI/M_⊙) + 2 log h_70 of
9.28 ± 0.06, and a spread in log(M_HI/M_⊙) + 2 log h_70 of 0.41 ± 0.03.

We conclude that the proper values of α and log(M_*/M_⊙) extracted from
the 1/V_max method are -1.33 ± 0.03 and 9.95 ± 0.04, respectively.
Table 1 lists both the spurious 1/V_max Schechter function parameters
and the parameters found when a Gaussian is added to fit the spurious
feature. The addition of the Gaussian brings the 1/V_max results for
the parameters α and M_* into excellent agreement with the 2DSWML
method and the flux-limited α.40_1.8 subsample results.

As an additional test of our corrections for profile width sensitivity,
we have derived the 1/V_max HIMF from the integrated-flux-limited
subsample α.40_1.8 (described in §2.4). This mass function is corrected
for large-scale structure and includes mass errors, but is not subject
to the same bias against broad HI profiles. The α.40_1.8 HIMF is well
fit by a pure Schechter function. The results are listed in Table 1.
The α.40_1.8 HIMF is consistent with those derived from the full α.40
sample. We therefore conclude that our survey sensitivity is well
characterized and that our measurements based on the full sample are
complete and representative. However, since this limited sample does
not probe the galaxies at the extremes of the mass function, it is
subject to larger errors on the points and in the parameters.

4.1.1. Measurement of Ω_HI

The density Ω_HI of neutral hydrogen in the local Universe, expressed
in units of the critical density, can be calculated in two ways from
the derived HI mass function. Integrating analytically over the
best-fit Schechter function gives Ω_HI = φ_* M_* Γ(2 + α) =
(4.4 ± 0.3) × 10^-4 h_70^-1, slightly (16%) higher than the final
HIPASS value of 3.7 × 10^-4 h_70^-1 (Zwaan et al. 2005). Using the
binned points directly, we find the same result: Ω_HI =
(4.4 ± 0.1) × 10^-4 h_70^-1. This agreement is an indication that our
findings are well represented in the high-mass bins by our Schechter
function fit, despite the spurious feature. Ω_HI carries a small error
since it is negligibly affected by the mass and distance errors on the
faint end.
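
The analytic integral can be checked numerically as in the sketch below. It assumes the per-dex Schechter convention, for which the mass density is φ_* M_* Γ(2 + α), and the standard critical density of about 2.775 × 10^11 h^2 M_⊙ Mpc^-3; neither that constant nor the function name comes from the text.

```python
import math

# Critical density in units of h^2 M_sun Mpc^-3 (standard value, assumed here)
RHO_CRIT_H2 = 2.775e11


def omega_hi(phi_star, log_m_star, alpha, h=0.7):
    """Omega_HI from Schechter parameters.

    phi_star:   normalization in Mpc^-3 dex^-1 (for the adopted h)
    log_m_star: log10 of the characteristic mass in solar masses
    alpha:      faint-end slope
    """
    rho_hi = phi_star * 10.0 ** log_m_star * math.gamma(alpha + 2.0)
    return rho_hi / (RHO_CRIT_H2 * h * h)


# 1/V_max pure Schechter fit and 2DSWML fit quoted in this paper
omega_vmax = omega_hi(6.0e-3, 9.91, -1.25)
omega_swml = omega_hi(4.8e-3, 9.96, -1.33)
```

With the quoted parameters both values come out near 4.3 to 4.4 × 10^-4, in line with the figures given in the text.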

In Fig. 9, we show the contribution of each 1/V_max mass bin to Ω_HI as
filled circles. The total density of neutral hydrogen in the local
Universe is dominated by galaxies with 9.0 < log(M_HI/M_⊙) < 10.0, and
in these bins we measure the HIMF to be larger than Zwaan et al. (2005)
do, thus finding a larger value of Ω_HI. The ALFALFA survey extends
further in redshift than HIPASS, with a median redshift of ~8000 km
s^-1 compared to ~3000 km s^-1, allowing us to detect significantly
more high-mass objects (§6.2).

5. 2DSWML METHOD: RESULTS 

5.1. Global HI Mass Function and Ω_HI

The HIMF derived from α.40 through the 2DSWML method is shown in
Fig. 10. The derived parameters are φ_* (h_70^3 Mpc^-3 dex^-1) =
(4.8 ± 0.3) × 10^-3, log(M_*/M_⊙) + 2 log h_70 = 9.96 ± 0.02 and
α = -1.33 ± 0.02. To test the robustness of this HIMF estimate, we also
applied a one-dimensional SWML approach to the flux-limited α.40_1.8
sample, and found results consistent with the global, two-dimensional
result (φ_* = (4.5 ± 0.9) × 10^-3, log(M_*) = 9.96 ± 0.04 and
α = -1.36 ± 0.06).

5.1.1. Measurement of Ω_HI

As in the case of the 1/V_max method, we calculate the neutral hydrogen
density Ω_HI from an analytical integration of the best-fit Schechter
function and from a summation over the points themselves. From the
Schechter function we find Ω_HI = (4.3 ± 0.3) × 10^-4 h_70^-1, and from
the binned points we find (4.4 ± 0.1) × 10^-4 h_70^-1. In both cases,
our result is consistent with the 1/V_max method and is slightly higher
than the HIPASS result. The contribution of each bin is shown in Fig. 9
as open circles.

6. DISCUSSION 

Fig. 11 compares the α.40 HIMF derived via the 1/V_max method (filled
circles) and the SWML method (open circles), and shows the difference
between them in the bottom panel. The bin-by-bin differences between
the SWML and 1/V_max methods are small and do not affect the
measurement of Ω_HI, though the faintest, most error-prone bins are
found to be more populated in the SWML analysis. After we have
corrected for the feature introduced to the 1/V_max result by
large-scale structure, we find excellent agreement between all
measurements of α (-1.33 ± 0.02) and Ω_HI (4.3 ± 0.2, in units of
10^-4 h_70^-1).

In the case of 1/V_max, large-scale structure and the correction we
apply to deal with it have the largest impact on the final result. The
2DSWML method is designed to be insensitive to density fluctuations,
and the agreement between the two measurements indicates that the
large-scale structure correction is successful.



6.1. Impact of the Virgo Cluster 

Measurements of the HI mass function can be sensitive to large-scale
structure in the survey volume. As discussed above, we correct for
large-scale structure in the 1/V_max method to ameliorate this effect,
but our 2DSWML measurement could also be sensitive to a large nearby
overdensity such as the Virgo cluster. To test the robustness of the
1/V_max correction and of our derived HIMF, we consider the result
obtained when we exclude the portion of α.40 that crosses the Virgo
cluster. Many of our low-mass objects are contributed by this nearby
overdensity, and our large-scale structure correction is largest in
this region; if we are correcting appropriately, we should obtain the
same result regardless of the inclusion of the Virgo sources. This test
is imperfect, given that the local volume is overdense in general. We
exclude all galaxies lying within our adopted Virgo field, covering
12^h < α < 13^h and the full declination extent of the α.40 survey
(Trentham & Hodgkin 2002), reducing the sample size to ~9200 for
1/V_max and ~8600 for 2DSWML. Errors are measured as described above,
but in this case we jackknife resample over only 18 subregions.

Our results, within the errors, are the same whether or not we exclude
the Virgo overdensity. This is true both for the parameters and for our
measurement of Ω_HI. In the case of 1/V_max, we again find that a
Schechter function summed with a Gaussian provides a better fit to the
data by accounting for features introduced by large-scale structure in
the foreground of the Pisces-Perseus supercluster. In Table 1 we
compare our findings for samples inclusive and exclusive of Virgo.
Additionally, we list the HIPASS HI mass function and the Stierwalt et
al. (2009) HIMF of ALFALFA sources in the Leo group. In the case of the
α.40 and α.40_1.8 samples, we also list the value of Ω_HI found by
integrating the Schechter function fit or using the HIMF bin points.
Each table entry is accompanied by 1σ errors in parentheses.

6.2. Comparison with Previous Work 

We find a value of Ω_HI that is 16% higher than the complete HIPASS
survey value (Zwaan et al. 2005). That HIPASS result is excluded by our
2σ errors, but the more preliminary HIPASS result (Zwaan et al. 2003)
is in agreement with our result while carrying a significantly larger
error than we find. We also find log(M_*/M_⊙) = 9.96, so that the break
in our HIMF occurs at masses 0.1 dex higher than was found in either of
the HIPASS analyses. Since the high-mass end of the HIMF is sensitive
to M_*, HIPASS significantly undercounts the highest-mass gas-rich
galaxies. When our Schechter function is extrapolated to
log(M_HI/M_⊙) = 11.0, we predict an order of magnitude more galaxies
than HIPASS. At the more modest value log(M_HI/M_⊙) = 10.75, this is
reduced to a factor of ~5.
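
The size of this difference can be reproduced from the Schechter parameters in Table 1, as sketched below. The code assumes the conventional per-dex Schechter form and uses the 2DSWML and HIPASS rows of the table; the function and variable names are hypothetical.

```python
import math


def schechter_per_dex(log_m, phi_star, log_m_star, alpha):
    """Schechter mass function dn/dlog10(M) in Mpc^-3 dex^-1."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


# Parameters as listed in Table 1 (2DSWML and HIPASS rows)
ALFALFA = dict(phi_star=4.8e-3, log_m_star=9.96, alpha=-1.33)
HIPASS = dict(phi_star=5.0e-3, log_m_star=9.86, alpha=-1.37)


def abundance_ratio(log_m):
    """Ratio of the ALFALFA to the HIPASS Schechter fit at a given mass."""
    return (schechter_per_dex(log_m, **ALFALFA)
            / schechter_per_dex(log_m, **HIPASS))


ratio_11 = abundance_ratio(11.0)
ratio_1075 = abundance_ratio(10.75)
```

With these inputs the ratio is roughly 20 at log(M_HI/M_⊙) = 11.0 and about 5.5 at 10.75. The steep rise comes almost entirely from the exponential cutoff, so a 0.1 dex shift in M_* dominates over the small differences in φ_* and α.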

In Fig. 12, we show the mass of α.40 detections as a function of their
distance in Mpc, and compare that to the HIPASS completeness and
detection limits. The dashed vertical line shows the 12,700 km s^-1
redshift cutoff of HIPASS (assuming H_0 = 70 km s^-1 Mpc^-1),
demonstrating the ALFALFA survey's ability to probe the rare
highest-mass galaxies at large redshifts. While the α.40 sample extends
only to 15,000 km s^-1 in order to avoid RFI, the full ALFALFA
bandwidth allows us to probe to 18,000 km s^-1. Given that the survey
was designed to be sensitive at those greater redshifts, we are still
able to observe many galaxies at the limit of α.40, while the Zwaan et
al. (2005) sample becomes very sparse near the survey's redshift
limits.

Table 1
HI Mass Function Fit Parameters

Sample and                            α              φ_*                            log(M_*/M_⊙)     Ω_HI, fit           Ω_HI, points
Fitting Function                                     (10^-3 h_70^3 Mpc^-3 dex^-1)   + 2 log h_70     (× 10^-4 h_70^-1)   (× 10^-4 h_70^-1)

1/V_max                               -1.25 (0.02)   6.0 (0.3)                      9.91 (0.01)      4.4 (0.2)           4.4 (0.1)
  Schechter + Gaussian^a              -1.33 (0.03)   3.7 (0.6)^b                    9.95 (0.04)      ...                 ...
1/V_max, Non-Virgo                    -1.20 (0.02)   6.1 (0.3)                      9.90 (0.01)      4.1 (0.2)           4.3 (0.1)
  Schechter + Gaussian^a              -1.33 (0.04)   3.1 (0.6)^b                    9.95 (0.05)      ...                 ...
2DSWML                                -1.33 (0.02)   4.8 (0.3)                      9.96 (0.02)      4.3 (0.3)           4.4 (0.1)
2DSWML, Non-Virgo                     -1.34 (0.02)   4.7 (0.3)                      9.96 (0.01)      4.3 (0.3)           4.4 (0.1)
1/V_max, α.40_1.8                     -1.30 (0.03)   4.6 (0.3)                      9.96 (0.02)      4.0 (0.3)           4.0 (0.1)
1DSWML, α.40_1.8                      -1.36 (0.06)   4.5 (0.9)                      9.96 (0.04)      4.4 (0.9)           4.3 (0.3)
HIPASS (Zwaan et al. 2005)^c          -1.37 (0.06)   5 (1)                          9.86 (0.04)      3.7 (0.5)           ...
Leo Group (Stierwalt et al. 2009)^d   -1.41 (0.2)    ...                            ...              ...                 ...

^a In the 1/V_max case, pure Schechter functions provide a poor fit to the faint-end slope α, which explains the difference in α for the two fitting functions. The Gaussian component parameters are not shown in the table, given that they are not expected to be physical.
^b We defer to the 2DSWML measurement of φ_* due to the spurious feature in the 1/V_max results.
^c Reported statistical and systematic errors combined in quadrature.
^d The excluded parameters φ_* and M_* in the Leo Group are highly uncertain due to the lack of high-mass galaxies in its small volume.

This improved measurement of the HIMF has implications for work that
relied upon the HIPASS results. Present-day HI surveys are limited in
their ability to probe redshift space, even when they are targeted
(z < 0.5), so models of the evolution of the HI mass function rely on
the measurement at z = 0. Higher-precision measurements provide better
constraints for evolutionary models. Numerical models of galaxy
formation and evolution (Power et al. 2010) depend on the z = 0 HIMF to
assess the success of the models and to extrapolate that result to
predictions for future HI surveys. For example, Abdalla et al. (2010)
predicted the ability of future HI line surveys with an instrument like
the Square Kilometer Array (SKA) to constrain dark energy through
measurements of the baryon acoustic oscillation scale. Those authors
consider models of the HIMF evolution that are sensitive to the value
of M_*. Typically, these galaxy models also depend on the assumed H2/HI
ratio to convert simulated cold gas into atomic and molecular
components (e.g. Baugh et al. 2004), so updated estimates of either
Ω_HI or Ω_H2 affect our ability to produce realistic models of gas-rich
galaxies.

We confirm previous findings that Ω_HI at z = 0 is inconsistent with
the value inferred from damped Lyman α absorber (DLA) systems at z ~ 2
and that significant evolution is required to reconcile measurements in
the two epochs (Noterdaeme et al. 2009; Rao et al. 2006), while
providing a tighter constraint on the present-day energy density of
cold gas.

6.3. Comparison with Simulations 

Obreschkow et al. (2009) (hereafter O09) used the Millennium Simulation
catalog, the De Lucia & Blaizot (2007) virtual catalog of galaxies, and
a physically motivated prescription to assign realistic gas (HI, He and
H2) masses at a range of redshifts. While this catalog has a limited
ability to realistically trace detailed galaxy evolution and limited
mass resolution (down to about 10^8 M_⊙ of neutral hydrogen, comparable
to the particle size in the Millennium Run; Springel et al. 2005), it
serves as the best currently available comparison of observed gas-rich
disks with the underlying theory of dark matter halos.

6.3.1. Simulated HI Mass Function 

O09 derive an HI mass function that is, in its gross properties,
consistent with HIPASS (Zwaan et al. 2005), ignoring spurious features
near the mass resolution limit of the simulation. The O09 gas masses
are obtained by combining the cold particle masses from the Millennium
Run with a model to split the cold gas into molecular hydrogen and
atomic hydrogen and helium components. Fig. 13 compares the O09 HIMF,
including only galaxies with log(M_HI/M_⊙) > 8.0 and at redshift z = 0,
with the 1/V_max and 2DSWML HIMFs derived in this work.

The value Ω_HI = 3.4 × 10^-4 inferred from the O09 HIMF is in good
agreement with this work and with HIPASS. While it is clear that the
cold gas prescription generally recovers the overall density and the
gross properties of the statistical distribution, the details of the
O09 HIMF disagree with observations, particularly at the extreme
low-mass end, where the Millennium Run work suffers from poor
resolution and inadequate merger histories.

It is also worth noting that O09 report that they overpredict the
number of high-mass sources in comparison to HIPASS, and suggest that
this may be due to opacity in observed disks at these masses. However,
we find that they underpredict high-mass galaxies at z = 0, the
opposite effect. The overprediction reported by O09 is likely due to
their analysis of the HIMF, which is not limited to the final galaxies
evolved to z = 0; rather, their HIMF also includes galaxies at
higher-redshift simulation snapshots, which are presumably more
gas-rich than their present-day counterparts. This would therefore
overpredict the abundance of high-mass galaxies.

6.3.2. Faint-End Slope

As has been found in previous work, the faint-end slope of the α.40
HIMF is significantly shallower than the Press-Schechter prediction of
α ~ -1.8 (Press & Schechter 1974). Potentially, this difference can be
linked to baryon loss and the suppression of accretion via photoheating
in low-mass dark matter halos. Simulations suggest that dark matter
halos with masses below ~6.5 × 10^9 h^-1 M_⊙ result in baryon-poor
galaxies in present-day voids and other environments (Hoeft et al.
2006, 2008; Hoeft & Gottloeber 2010). In principle, the discrepancy
could be explained by an argument invoking the mass scale at which
photoheating becomes important.

A fitting function has been proposed (Gnedin 2000) to describe the
behavior of the baryon fraction as a function of underlying halo mass:

$$
f_b = f_{b,0} \left[ 1 + \left( 2^{\gamma/3} - 1 \right)
      \left( \frac{M_c}{M_{tot}} \right)^{\gamma} \right]^{-3/\gamma}
\qquad (3)
$$

where the parameters f_b,0 and M_c are, respectively, the baryon
fraction in large halos and the characteristic halo mass where
f_b = f_b,0/2.

If a decreasing baryon fraction with decreasing halo mass explains the
difference between low-mass slopes in baryonic (stellar and HI) and
halo mass functions, then this fitting function should consistently
predict baryonic and cold gas mass functions with values of α ~ -1.3.
In the low-mass limit, the first term in the brackets of Eqn. 3 can be
dropped and the total mass in a halo can be assumed to be dominated by
the dark matter, M_tot ≈ M_D. Via the definition f_b = M_B/M_D we have

$$
M_D = \frac{M_B}{f_b}
    = \frac{M_B}{f_{b,0}} \left( 2^{\gamma/3} - 1 \right)^{3/\gamma}
      \left( \frac{M_c}{M_D} \right)^{3}
\qquad (4)
$$



Compressing all constants gives the relation M_D ∝ M_B^(1/4), which can
then be used to relate the low-mass ends of the baryonic and dark
matter mass functions. On the faint end of the dark matter mass
function, the exponential term of the Schechter function can be
dropped. From dn/d log M_D ∝ φ_* M_D^(α_D + 1) we can, finally,
conclude that

$$
\frac{dn}{d\log M_B} \propto \phi_* \, M_B^{(\alpha_D + 1)/4}
\qquad (5)
$$



where (α_D + 1)/4 = α_B + 1. Starting from the Press-Schechter
prediction of a faint-end slope α_D ~ -1.8, the consideration of baryon
fraction leads to α_B ~ -1.2, which is more consistent with HI and
stellar mass functions. In principle, the discrepancy between dark
matter simulations and observed baryon mass functions could be
explained by the photoheating simulations of Hoeft et al. (2008) and
Hoeft & Gottloeber (2010).
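
The two steps of this argument can be checked numerically: the fitting function of Eqn. 3 approaches f_b ∝ M^3 well below M_c, and that power maps a halo slope of -1.8 onto a baryonic slope of -1.2. The sketch below is an illustration only; the function names and the example parameter values (f_b,0 = 0.16, M_c = 10^9, γ = 2) are assumptions chosen for the check.

```python
import math


def baryon_fraction(m_tot, f_b0, m_c, gamma):
    """Baryon fraction fitting function of Eqn. 3."""
    term = (2.0 ** (gamma / 3.0) - 1.0) * (m_c / m_tot) ** gamma
    return f_b0 * (1.0 + term) ** (-3.0 / gamma)


def log_slope(m_tot, f_b0, m_c, gamma, eps=1e-4):
    """Numerical d ln(f_b) / d ln(M_tot) at m_tot."""
    hi = baryon_fraction(m_tot * (1.0 + eps), f_b0, m_c, gamma)
    lo = baryon_fraction(m_tot * (1.0 - eps), f_b0, m_c, gamma)
    return (math.log(hi) - math.log(lo)) / (
        math.log(1.0 + eps) - math.log(1.0 - eps))


def baryonic_faint_end_slope(alpha_d, fraction_slope=3.0):
    """Faint-end slope after mapping M_B = f_b * M_D with f_b ~ M_D^n."""
    return (alpha_d + 1.0) / (1.0 + fraction_slope) - 1.0


half = baryon_fraction(1.0e9, 0.16, 1.0e9, 2.0)   # f_b at M_tot = M_c
slope = log_slope(1.0e3, 0.16, 1.0e9, 2.0)        # far below M_c
alpha_b = baryonic_faint_end_slope(-1.8)
```

Masses nearer M_c have a logarithmic slope below 3, so a fit over a finite mass range returns a faint-end slope somewhat different from the asymptotic -1.2.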

The baryon fraction of O09's simulated galaxies loosely follows this
descriptive baryon fraction function (Eqn. 3). However, the halo mass
scale at which the baryon fraction starts to drop steeply is about two
orders of magnitude larger than the scale found by the detailed
hydrodynamical simulations of Hoeft et al. (2008). Additionally, there
is large scatter in the mass interval of interest, since the
simulation's resolution is poor for the halo masses where baryon loss
becomes important. The level of agreement between O09 and Hoeft et al.
(2008) is therefore difficult to quantify, and we use the latter's
determination of f_b in what follows.

Table 2
Faint-End Slopes of Modeled Baryon Mass Functions

f_b,0    M_c    γ      α

0.20     9.0    1.0    -1.30
0.20     9.5    1.0    -1.27
0.16     9.0    1.0    -1.31
0.16     9.5    1.0    -1.28
0.16     9.0    1.5    -1.24
0.16     9.5    1.5    -1.22
0.16     9.0    2.0    -1.21
0.16     9.5    2.0    -1.19
0.15     9.0    1.0    -1.31
0.15     9.5    1.0    -1.28
0.15     9.0    2.0    -1.21
0.15     9.5    2.0    -1.19

Eqn. 3 suggests that the baryonic content of low-mass galaxies in α.40
may be severely biased with respect to the underlying halo mass
distribution. If simulations accurately predict the relationship
between initial halo masses and resulting baryon fractions after
reionization and photoheating, then the application of f_b should
provide an estimate of the resulting baryon mass function at z = 0.
This depends on the extremely naive assumption that the cold HI gas
content is depleted in the same fraction as the baryons overall.

The publicly available GENMF code^6 produces halo mass function fits to
the Reed et al. (2007) N-body simulations at high resolution, from 10^5
to 10^12 h^-1 M_⊙. We adopt their mass function at z = 0, with their
suggested parameters Ω_M ≈ 0.238, Ω_Λ ≈ 0.762, and σ_8 = 0.74 (at
z = 0), and apply Eqn. 3 to extract the predicted baryon mass function
and fit the faint-end slope. The results are displayed in Table 2 for
an exemplary set of values for f_b,0, M_c and γ.

Through this approach, it is possible to modify the
underlying halo mass function ($\alpha \approx -1.8$) to meet our
observations ($\alpha \approx -1.3$). The suggestion that low-mass
halos may re-accrete cold gas at late times (Ricotti 2009),
if substantiated, could further change the shape of the
resulting baryon mass function. While this approach
indicates we may be close to resolving the missing satellites
problem and the discrepancy between predicted and
observed faint-end mass function slopes, the precise
requirements of baryon depletion mechanisms are not
well-constrained by available simulations.

7. CONCLUSIONS 

We have derived the HI mass function from a sample of
~10,000 extragalactic sources comprising the ALFALFA
40% Survey, and have adapted the $1/V_{max}$ method to
fully account for survey sensitivity, large-scale structure,
and mass errors. We have demonstrated the robustness

$^6$ http://icc.dur.ac.uk/Research/PublicDownloads/genmf_readme.html






of this method by testing flux-limited samples and by
calculating the HIMF via a second approach, the
structure-insensitive 2DSWML method. Our major result, the
derivation of the global HIMF, indicates a Schechter
function with parameters $\phi_*$ ($h_{70}^3$ Mpc$^{-3}$ dex$^{-1}$) $= (4.8 \pm 0.3) \times 10^{-3}$,
$\log(M_*/M_\odot) + 2 \log h_{70} = 9.96 \pm 0.02$
and $\alpha = -1.33 \pm 0.02$.
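As an illustration of how these parameters translate into space densities, the following minimal Python sketch evaluates a Schechter function per dex of HI mass. It assumes the standard per-dex form $\phi(M) = \ln(10)\,\phi_*\,(M/M_*)^{\alpha+1} e^{-M/M_*}$ and $h_{70} = 1$; the function and variable names are illustrative only.

```python
import math

# Best-fit parameters quoted in the text, taking h_70 = 1 (assumption).
PHI_STAR = 4.8e-3    # Mpc^-3 dex^-1
LOG_M_STAR = 9.96    # log10(M* / Msun)
ALPHA = -1.33        # faint-end slope


def schechter_per_dex(log_m, phi_star=PHI_STAR, log_m_star=LOG_M_STAR,
                      alpha=ALPHA):
    """Space density per dex of HI mass (Mpc^-3 dex^-1) at log10(M/Msun)."""
    x = 10.0 ** (log_m - log_m_star)  # M / M*
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


# Tabulate the function across the mass range covered by the sample.
for log_m in (7.0, 8.0, 9.0, 9.96, 10.75, 11.0):
    print(f"log M = {log_m:5.2f}   phi = {schechter_per_dex(log_m):.3e}")
```

Below $M_*$ the function behaves as a power law of slope $\alpha + 1$ in $\log M$; above $M_*$ the exponential cutoff dominates, which is why the predicted counts near $\log(M_{HI}/M_\odot) = 11$ are so sensitive to the fitted $M_*$.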

We find $\Omega_{HI} = (4.3 \pm 0.3) \times 10^{-4}\,h_{70}^{-1}$, a robust
constraint that is 16% higher than the complete HIPASS
survey value $3.7 \times 10^{-4}\,h_{70}^{-1}$ (Zwaan et al. 2005), which we
exclude at the $2\sigma$ level. The more preliminary HIPASS
result (Zwaan et al. 2003) is in agreement with our result,
but carries a significantly larger error. When we exclude
the Virgo cluster from our analysis, the $\Omega_{HI}$ value
remains stable, indicating that our measurements are
robust against large-scale structure. In each case, we find
the same value of $\Omega_{HI}$ whether derived from the binned
HIMF points themselves or from the best-fit Schechter
parameters.
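The Schechter-based route to $\Omega_{HI}$ can be checked with a few lines of Python. Integrating $M$ times a per-dex Schechter function over all masses gives $\rho_{HI} = \phi_*\,M_*\,\Gamma(\alpha + 2)$, and dividing by the critical density gives $\Omega_{HI}$. The sketch below assumes $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and rounded physical constants; the function names are illustrative only.

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22    # metres per megaparsec
MSUN_IN_KG = 1.989e30   # kilograms per solar mass


def critical_density(h0_km_s_mpc=70.0):
    """Critical density in Msun Mpc^-3 for the given Hubble constant."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M            # s^-1
    rho_si = 3.0 * h0_si ** 2 / (8.0 * math.pi * G)   # kg m^-3
    return rho_si * MPC_IN_M ** 3 / MSUN_IN_KG


def omega_hi(phi_star, log_m_star, alpha, h0_km_s_mpc=70.0):
    """Omega_HI from Schechter parameters: rho_HI = phi* M* Gamma(alpha + 2)."""
    rho_hi = phi_star * 10.0 ** log_m_star * math.gamma(alpha + 2.0)
    return rho_hi / critical_density(h0_km_s_mpc)


print(f"Omega_HI = {omega_hi(4.8e-3, 9.96, -1.33):.2e}")
```

With the best-fit parameters quoted above, the result should fall within the stated uncertainty of the quoted $\Omega_{HI}$, which is a useful consistency check on the fit.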

The larger values of $\Omega_{HI}$ and of $M_*$ that we find in
comparison to HIPASS demonstrate ALFALFA's advantage
in detecting high-mass galaxies at large distances.
On the extreme high-mass end of the HI mass function,
our measurement and the accompanying Schechter
function predict an order of magnitude more galaxies at
$\log(M_{HI}/M_\odot) \sim 11.0$, and we find a factor of ~5 more
galaxies at $\log(M_{HI}/M_\odot) = 10.75$. This has implications
for previous estimates of the detection rate of future
large-scale HI line surveys with the SKA.

We confirm previous findings that significant evolution
in cold gas reservoirs must occur between $z \sim 2$ and $z = 0$
given that $\Omega_{HI}$ is a factor of ~2 smaller in the former
epoch compared with the latter (Noterdaeme et al.
2009; Rao et al. 2006). Further, we suggest that work on
photoheating and other processes that prevent low-mass
dark matter halos from accreting gas may be coming
close to explaining the so-called 'missing satellite problem'
at low redshift. Further numerical work, particularly
at resolutions capable of recovering low densities of
cold gas at $z = 0$, is required in this area of research.

Future work will consider the variation of the HI mass 
function with environment, and will include larger num- 
bers of galaxies across a full range of extragalactic en- 
vironments as the ALFALFA survey continues and new 
data products are released. 

The authors would like to acknowledge the work of 
the entire ALFALFA collaboration team in observing, 
flagging, and extracting the catalog of galaxies used in 
this work. 

This work was supported by NSF grants AST-0607007 
and AST-9397661, and by grants from the National De- 
fense Science and Engineering Graduate (NDSEG) fel- 
lowship and from the Brinson Foundation. 

REFERENCES 



Abdalla, F. B., Blake, C., & Rawlings, S. 2010, MNRAS, 401, 743
Baldry, I. K., Glazebrook, K., Brinkmann, J., Ivezic, Z., Lupton,
R. H., Nichol, R. C., & Szalay, A. S. 2004, ApJ, 600, 681
Baugh, C. M., Lacey, C. G., Frenk, C. S., Benson, A. J., Cole, S.,
Granato, G. L., Silva, L., & Bressan, A. 2004, New Astronomy
Review, 48, 1239



Boylan-Kolchin, M., Springel, V., White, S. D. M., Jenkins, A., &
Lemson, G. 2009, MNRAS, 398, 1150
Branchini, E., Teodoro, L., Frenk, C. S., Schmoldt, I., Efstathiou,
G., White, S. D. M., Saunders, W., Sutherland, W.,
Rowan-Robinson, M., Keeble, O., Tadros, H., Maddox, S., &
Oliver, S. 1999, MNRAS, 308, 1
Ceverino, D., & Klypin, A. 2009, ApJ, 695, 292 
Davis, M., & Huchra, J. 1982, ApJ, 254, 437 
De Lucia, G., & Blaizot, J. 2007, MNRAS, 375, 2 
Efstathiou, G. 2000, MNRAS, 317, 697 

Efstathiou, G., Ellis, R. S., & Peterson, B. A. 1988, MNRAS, 232, 
431 

Fukugita, M., Hogan, C. J., & Peebles, P. J. E. 1998, ApJ, 503, 
518 

Fukugita, M., & Peebles, P. J. E. 2004, ApJ, 616, 643 
Geha, M., Blanton, M. R., Masjedi, M., & West, A. A. 2006, ApJ, 
653, 240 

Giovanelli, R., Haynes, M. P., Kent, B. R., & Adams, E. A. K. 
2010, ApJ, 708, L22 

Giovanelli, R., Haynes, M. P., Kent, B. R., Perillat, P., Catinella, 
B., Hoffman, G. L., Momjian, E., Rosenberg, J. L., Saintonge, 
A., Spekkens, K., Stierwalt, S., Brosch, N., Masters, K. L., 
Springob, C. M., Karachentsev, I. D., Karachentseva, V. E.,
Koopmann, R. A., Muller, E., van Driel, W., & van Zee, L. 
2005, AJ, 130, 2613 

Giovanelli, R., Haynes, M. P., Kent, B. R., Saintonge, A., 
Stierwalt, S., Altaf, A., Balonek, T., Brosch, N., Brown, S., 
Catinella, B., Furniss, A., Goldstein, J., Hoffman, G. L., 
Koopmann, R. A., Kornreich, D. A., Mahmood, B., Martin, 
A. M., Masters, K. L., Mitschang, A., Momjian, E., Nair, P. H., 
Rosenberg, J. L., & Walsh, B. 2007, AJ, 133, 2569 

Gnedin, N. Y. 2000, ApJ, 542, 535 

Gnedin, N. Y., Tassis, K., & Kravtsov, A. V. 2009, ApJ, 697, 55 
Governato, F., Willman, B., Mayer, L., Brooks, A., Stinson, G.,
Valenzuela, O., Wadsley, J., & Quinn, T. 2007, MNRAS, 374,
1479
Grebel, E. K., & Gallagher, III, J. S. 2004, ApJ, 610, L89
Haynes, M. P., Giovanelli, R., & Chincarini, G. L. 1984, ARA&A, 
22, 445 

Henning, P. A., Staveley-Smith, L., Ekers, R. D., Green, A. J., 
Haynes, R. F., Juraszek, S., Kesteven, M. J., Koribalski, B., 
Kraan-Korteweg, R. C., Price, R. M., Sadler, E. M., &
Schroder, A. 2000, AJ, 119, 2686 

Hoeft, M., & Gottloeber, S. 2010, ArXiv e-prints 

Hoeft, M., Yepes, G., & Gottlober, S. 2008, in IAU Symposium, 
Vol. 244, IAU Symposium, ed. J. Davies & M. Disney, 279-283 

Hoeft, M., Yepes, G., Gottlober, S., & Springel, V. 2006, 
MNRAS, 371, 401 

Irwin, M. J., Belokurov, V., Evans, N. W., Ryan-Weber, E. V., de 
Jong, J. T. A., Koposov, S., Zucker, D. B., Hodgkin, S. T., 
Gilmore, G., Prema, P., Hebb, L., Begum, A., Fellhauer, M., 
Hewett, P. C., Kennicutt, Jr., R. C., Wilkinson, M. I., Bramich,
D. M., Vidrih, S., Rix, H., Beers, T. C., Barentine, J. C.,
Brewington, H., Harvanek, M., Krzesinski, J., Long, D., Nitta, 
A., & Snedden, S. A. 2007, ApJ, 656, L13 

Jenkins, A., Frenk, C. S., White, S. D. M., Colberg, J. M., Cole, 
S., Evrard, A. E., Couchman, H. M. P., & Yoshida, N. 2001, 
MNRAS, 321, 372 

Kent, B. R., Giovanelli, R., Haynes, M. P., Martin, A. M., 
Saintonge, A., Stierwalt, S., Balonek, T. J., Brosch, N., &
Koopmann, R. A. 2008, AJ, 136, 713 

Li, C., & White, S. D. M. 2009, MNRAS, 398, 2177

Loveday, J. 2000, MNRAS, 312, 557 

Madgwick, D. S., Lahav, O., Baldry, I. K., Baugh, C. M.,
Bland-Hawthorn, J., Bridges, T., Cannon, R., Cole, S., Colless,
M., Collins, C., Couch, W., Dalton, G., De Propris, R., Driver,
S. P., Efstathiou, G., Ellis, R. S., Frenk, C. S., Glazebrook, K.,
Jackson, C., Lewis, I., Lumsden, S., Maddox, S., Norberg, P.,
Peacock, J. A., Peterson, B. A., Sutherland, W., & Taylor, K. 
2002, MNRAS, 333, 133 

Martin, A. M., Giovanelli, R., Haynes, M. P., Saintonge, A., 
Hoffman, G. L., Kent, B. R., & Stierwalt, S. 2009, ApJS, 183, 
214 

Masters, K. L. 2005, PhD thesis, Cornell University, United 

States — New York 
Masters, K. L., Haynes, M. P., & Giovanelli, R. 2004, ApJ, 607,
L115






Mayer, L., Governato, F., & Kaufmann, T. 2008, Advanced 
Science Letters, 1, 7 

Meyer, M. J., Zwaan, M. A., Webster, R. L., Staveley-Smith, L., 
Ryan-Weber, E., Drinkwater, M. J., Barnes, D. G., Howlett, 
M., Kilborn, V. A., Stevens, J., Waugh, M., Pierce, M. J., 
Bhathal, R., de Blok, W. J. G., Disney, M. J., Ekers, R. D., 
Freeman, K. C., Garcia, D. A., Gibson, B. K., Harnett, J.,
Henning, P. A., Jerjen, H., Kesteven, M. J., Knezek, P. M., 
Koribalski, B. S., Mader, S., Marquarding, M., Minchin, R. F., 
O'Brien, J., Oosterloo, T., Price, R. M., Putman, M. E., Ryder, 
S. D., Sadler, E. M., Stewart, I. M., Stootman, F., & Wright, 
A. E. 2004, MNRAS, 350, 1195 

Noterdaeme, P., Petitjean, P., Ledoux, C., & Srianand, R. 2009,
A&A, 505, 1087 

Obreschkow, D., Croton, D., De Lucia, G., Khochfar, S., &
Rawlings, S. 2009, ApJ, 698, 1467 

Power, C., Baugh, C. M., & Lacey, C. G. 2010, MNRAS, 974

Press, W. H., & Schechter, P. 1974, ApJ, 187, 425 

Prochaska, J. X., & Tumlinson, J. 2009, Baryons: What, When
and Where?, ed. Thronson, H. A., Stiavelli, M., & Tielens, A.,
419

Rao, S. M., Turnshek, D. A., & Nestor, D. B. 2006, ApJ, 636, 610
Reed, D. S., Bower, R., Frenk, C. S., Jenkins, A., & Theuns, T.
2007, MNRAS, 374, 2
Ricotti, M. 2009, MNRAS, 392, L45
Rosenberg, J. L., & Schneider, S. E. 2002, ApJ, 567, 247
Ryan-Weber, E. V., Begum, A., Oosterloo, T., Pal, S., Irwin,
M. J., Belokurov, V., Evans, N. W., & Zucker, D. B. 2008,
MNRAS, 384, 535
Saintonge, A. 2007, AJ, 133, 2087 

Saintonge, A., Giovanelli, R., Haynes, M. P., Hoffman, G. L., 
Kent, B. R., Martin, A. M., Stierwalt, S., & Brosch, N. 2008, 
AJ, 135, 588 

Sandage, A., Tammann, G. A., & Yahil, A. 1979, ApJ, 232, 352 
Schmidt, M. 1968, ApJ, 151, 393 

Schombert, J. M., McGaugh, S. S., & Eder, J. A. 2001, AJ, 121, 
2420 

Simon, J. D., & Geha, M. 2007, ApJ, 670, 313 



Springel, V., White, S. D. M., Jenkins, A., Frenk, C. S., Yoshida, 
N., Gao, L., Navarro, J., Thacker, R., Croton, D., Helly, J., 
Peacock, J. A., Cole, S., Thomas, P., Couchman, H., Evrard,
A., Colberg, J., & Pearce, F. 2005, Nature, 435, 629 

Springob, C. M., Haynes, M. P., & Giovanelli, R. 2005, ApJ, 621, 
215 

Springob, C. M., Masters, K. L., Haynes, M. P., Giovanelli, R., &
Marinoni, C. 2007, ApJS, 172, 599
Stierwalt, S., Haynes, M. P., Giovanelli, R., Kent, B. R., Martin,
A. M., Saintonge, A., Karachentsev, I. D., & Karachentseva,
V. E. 2009, AJ, 138, 338
Tolstoy, E., Hill, V., & Tosi, M. 2009, ARA&A, 47, 371
Trentham, N. 1998, MNRAS, 294, 193 
Trentham, N., & Hodgkin, S. 2002, MNRAS, 333, 423 
Wong, O. I., Ryan-Weber, E. V., Garcia-Appadoo, D. A.,
Webster, R. L., Staveley-Smith, L., Zwaan, M. A., Meyer, 
M. J., Barnes, D. G., Kilborn, V. A., Bhathal, R., de Blok, 
W. J. G., Disney, M. J., Doyle, M. T., Drinkwater, M. J., 
Ekers, R. D., Freeman, K. C., Gibson, B. K., Gurovich, S.,
Harnett, J., Henning, P. A., Jerjen, H., Kesteven, M. J., 
Knezek, P. M., Koribalski, B. S., Mader, S., Marquarding, M., 
Minchin, R. F., O'Brien, J., Putman, M. E., Ryder, S. D., 
Sadler, E. M., Stevens, J., Stewart, I. M., Stootman, F., & 
Waugh, M. 2006, MNRAS, 371, 1855 
Zwaan, M., Meyer, M., & Staveley-Smith, L. 2009, ArXiv e-prints 
Zwaan, M. A., Meyer, M. J., Staveley-Smith, L., & Webster,
R. L. 2005, MNRAS, 359, L30
Zwaan, M. A., Staveley-Smith, L., Koribalski, B. S., Henning, 
P. A., Kilborn, V. A., Ryder, S. D., Barnes, D. G., Bhathal, R., 
Boyce, P. J., de Blok, W. J. G., Disney, M. J., Drinkwater, 
M. J., Ekers, R. D., Freeman, K. C., Gibson, B. K., Green,
A. J., Haynes, R. F., Jerjen, H., Juraszek, S., Kesteven, M. J.,
Knezek, P. M., Kraan-Korteweg, R. C., Mader, S.,
Marquarding, M., Meyer, M., Minchin, R. F., Mould, J. R., 
O'Brien, J., Oosterloo, T., Price, R. M., Putman, M. E., 
Ryan-Weber, E., Sadler, E. M., Schroder, A., Stewart, I. M., 
Stootman, F., Warren, B., Waugh, M., Webster, R. L., & 
Wright, A. E. 2003, AJ, 125, 2842 



APPENDIX 

A. DETAILS OF CORRECTIONS TO THE $1/V_{MAX}$ METHOD

A.1. Width-Dependent Sensitivity Correction

Giovanelli et al. (2005) predicted, from the precursor survey observations, that ALFALFA in full two-drift mode
could expect an approximate integrated flux detection threshold, $S_{int,th}$ in Jy km s$^{-1}$, dependent upon profile width
as follows:

$$S_{int,th} = \begin{cases} 0.15\; S/N\; (W_{50}/200)^{1/2}, & W_{50} < 200 \\ 0.15\; S/N\; (W_{50}/200), & W_{50} \geq 200 \end{cases} \qquad (A1)$$

In practice, however, ALFALFA outperforms this detection threshold, and we therefore use the data themselves to fit a
detection limit as described in §3.3.
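For concreteness, the predicted piecewise threshold of Giovanelli et al. (2005) can be written as a short Python function. This is only a sketch of the predicted limit, not of the empirical limit fitted to the data; the signal-to-noise value used in the example is a hypothetical choice.

```python
import math


def s_int_threshold(w50, snr):
    """Predicted integrated-flux threshold (Jy km/s) for a profile of
    width w50 (km/s) at signal-to-noise ratio snr."""
    if w50 <= 0.0:
        raise ValueError("w50 must be positive")
    ratio = w50 / 200.0
    if w50 < 200.0:
        # Narrow profiles: threshold grows as the square root of the width.
        return 0.15 * snr * math.sqrt(ratio)
    # Broad profiles: threshold grows linearly with the width.
    return 0.15 * snr * ratio


# Hypothetical signal-to-noise cut, chosen only for illustration.
SNR = 6.5
for w50 in (25.0, 50.0, 200.0, 400.0):
    print(f"W50 = {w50:5.0f} km/s   S_int,th = {s_int_threshold(w50, SNR):.3f} Jy km/s")
```

The two branches agree at $W_{50} = 200$ km s$^{-1}$, so the threshold is continuous; only its slope changes there.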

The width-dependent sensitivity correction is based on the distribution of observed profile widths. We also assume
that the distribution of observed galaxies gives an indication of the true underlying distribution. We are therefore
interested in working with as many sample galaxies as possible, and thus we consider a detection threshold $S_{int,th}$ as
a function of $W_{50}$ that indicates the limits of ALFALFA's detection ability, rather than a strict completeness limit as
in the 2DSWML case.

The completeness correction is based on the relationship of galaxy mass to the distribution of profile widths $W_{50}$. It
is known that HI profile widths and masses are correlated, and we observe a mass-dependent spread in the distribution
of profile width. We determine the profile width distribution as a function of mass by binning α.40 galaxies by
$\log(M_{HI}/M_\odot)$ and fitting to each histogram a Gumbel (or Extreme Value Type 1) distribution:

$$f(x) = \frac{1}{\beta}\, e^{-\frac{x-\mu}{\beta}}\, e^{-e^{-\frac{x-\mu}{\beta}}} \qquad (A2)$$

where $\mu$ parametrizes the center of the distribution and $\beta$ its breadth. The profile width distributions feature narrow
central peaks and extended skewed tails, which the Gumbel distribution is designed specifically to model.
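A minimal Python sketch of the Gumbel density and its cumulative form follows, assuming the standard Extreme Value Type 1 parametrization with location $\mu$ and scale $\beta$. The numerical values of $\mu$ and $\beta$ in the example are hypothetical and are not fitted values from the survey.

```python
import math


def gumbel_pdf(x, mu, beta):
    """Gumbel (Extreme Value Type 1) density with centre mu and breadth beta."""
    z = (x - mu) / beta
    if z < -700.0:   # exp(-z) would overflow; the density is effectively 0
        return 0.0
    return math.exp(-z - math.exp(-z)) / beta


def gumbel_cdf(x, mu, beta):
    """Cumulative Gumbel distribution."""
    z = (x - mu) / beta
    if z < -700.0:
        return 0.0
    return math.exp(-math.exp(-z))


# Hypothetical parameters, for illustration only.
MU, BETA = 2.0, 0.2
median = MU - BETA * math.log(math.log(2.0))
print(f"mode   = {MU:.3f}")
print(f"median = {median:.3f}  (above the mode: the long tail is on the high side)")
print(f"CDF at the mode = {gumbel_cdf(MU, MU, BETA):.4f}")
```

The mode sits at $\mu$ while the median lies above it, which is the narrow peak plus extended skewed tail described in the text.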

We find that the center of the profile width distribution increases linearly with $\log(M_{HI}/M_\odot)$, and the breadth
decreases linearly with $\log(M_{HI}/M_\odot)$. We derive a relationship between $\log(M_{HI}/M_\odot)$ and the parameters $\mu$ and $\beta$,
in order to extrapolate to any mass and infer the underlying distribution of $W_{50}$ to which a given galaxy belongs,






$P(W_{50}, M_{HI})$. The probability of detecting a galaxy in a given mass bin depends on the profile width distribution for
that bin, as well as the limiting profile width $W_{50,lim}$ beyond which that galaxy would not be detectable by ALFALFA.
We are seeking a correction factor $C$ that will account for the profile width-integrated flux bias and that satisfies the
relationship

$$N_{galaxies}(M_{HI}) = C\, N_{obs}(M_{HI}) \qquad (A3)$$

where $N_{galaxies}$ is the corrected galaxy count to be input for the calculation of the HIMF, and $N_{obs}$ is the observed
galaxy count. In terms of the derived distribution $P(W_{50}, M_{HI})$, we have

$$C = \frac{\int_0^{\infty} P(W_{50}, M_{HI})\, dW_{50}}{\int_0^{W_{50,lim}} P(W_{50}, M_{HI})\, dW_{50}} \qquad (A4)$$

Since a bin is made up of galaxies with varying $W_{50,lim}$, we apply this correction to each individual galaxy, rather
than on a mass bin-by-bin basis. The sum over effective search volume, $\Sigma\, 1/V_{max}$, therefore becomes $\Sigma\, C/V_{max}$.

To be conservative, we have included the errors on our derived linear relationships between $\log(M_{HI}/M_\odot)$ and the
Gumbel distribution parameters $\mu$ and $\beta$ in our final error analysis for the HI mass function.
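The per-galaxy correction can be sketched in Python as the ratio of the full width distribution to the part that lies below the limiting width. The sketch assumes a Gumbel width distribution whose integrals run upward from a lower bound `w_min` (zero by default); that bound, the function names, and the example numbers are assumptions for illustration.

```python
import math


def gumbel_cdf(x, mu, beta):
    """Cumulative Gumbel distribution, guarded against overflow."""
    t = -(x - mu) / beta
    if t > 700.0:   # exp(t) would overflow; the CDF is effectively 0
        return 0.0
    return math.exp(-math.exp(t))


def width_correction(w_lim, mu, beta, w_min=0.0):
    """Correction factor C: all widths above w_min divided by the
    detectable widths between w_min and w_lim."""
    below_min = gumbel_cdf(w_min, mu, beta)
    total = 1.0 - below_min
    detectable = gumbel_cdf(w_lim, mu, beta) - below_min
    if detectable <= 0.0:
        raise ValueError("no detectable widths below w_lim")
    return total / detectable


def corrected_sum(galaxies):
    """Sum of C / V_max over galaxies given as (w_lim, mu, beta, v_max)."""
    return sum(width_correction(w_lim, mu, beta) / v_max
               for w_lim, mu, beta, v_max in galaxies)


# Hypothetical galaxies: (limiting width, Gumbel mu, Gumbel beta, V_max).
sample = [(100.0, 100.0, 30.0, 2.0e3), (400.0, 100.0, 30.0, 5.0e4)]
for w_lim, mu, beta, _ in sample:
    print(f"W_lim = {w_lim:5.0f}   C = {width_correction(w_lim, mu, beta):.3f}")
print(f"sum C/V_max = {corrected_sum(sample):.3e}")
```

By construction $C \geq 1$, and it approaches 1 when the limiting width lies far out in the tail, so the correction matters mainly for galaxies whose limiting width falls near the peak of the distribution.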

A.2. Large Scale Structure Correction

The $1/V_{max}$ method would be biased by large scale structure if we counted galaxies in overdense regions with the
same weight as their counterparts in voids. Instead, we want to consider the effective search volume $V_{max,eff}$ in such
a way that overdense regions are counted as contributing more effective volume to the overall survey.

We modify $\Sigma\, 1/V_{max}$ to include weighting by the average density $n(V_{max})$ interior to $D_{max}$, normalized to the average
density of the Universe. The expression for measuring the HIMF then becomes $\Sigma\, 1/[n(V_{max})\,V_{max}]$ (Springob et al.
2005). We obtain $n(V_{max})$ from the PSCz density reconstruction of Branchini et al. (1999), using their Cartesian map
of evenly-spaced grid points out to 240 Mpc $h^{-1}$ smoothed to 3.2 Mpc $h^{-1}$ and using our assumed value $h = 0.7$. For
values $D_{max} \gtrsim 85$ Mpc, the average density interior to $D_{max}$ becomes equal to the average density in the PSCz map,
so no correction is needed. The large scale structure correction is therefore small compared to the Poisson counting
error for galaxies with $\log(M_{HI}/M_\odot) > 9.0$, which are found at large distances.

This weighting scheme for galaxy counts in over- and under-abundant regions corrects the relative counts between
different environments, so that clusters and superclusters don't dominate the shape of the measured HIMF.
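The density-weighted sum can be illustrated with a short Python sketch. Each galaxy contributes $1/[n\,V_{max}]$, where $n$ is the mean density interior to its maximum distance relative to the cosmic mean. The division by the bin width to obtain a per-dex value, the function name, and the example numbers are assumptions for illustration.

```python
def himf_bin_density(galaxies, bin_width_dex):
    """Density-weighted 1/V_max estimate for one mass bin.

    galaxies: iterable of (v_max, n_rel) pairs, where v_max is the maximum
    search volume (Mpc^3) and n_rel is the mean density inside that volume
    divided by the cosmic mean density.
    Returns the space density per dex (Mpc^-3 dex^-1).
    """
    total = 0.0
    for v_max, n_rel in galaxies:
        if v_max <= 0.0 or n_rel <= 0.0:
            raise ValueError("v_max and n_rel must be positive")
        # An overdense volume (n_rel > 1) counts as a larger effective volume,
        # so each galaxy found in it carries less weight.
        total += 1.0 / (n_rel * v_max)
    return total / bin_width_dex


# Two hypothetical galaxies with the same V_max, one in an overdense volume.
print(himf_bin_density([(1000.0, 2.0), (1000.0, 1.0)], bin_width_dex=0.2))
```

Setting `n_rel = 1.0` for every galaxy recovers the unweighted $\Sigma\, 1/V_{max}$ estimate, which is the regime described above for galaxies with large $D_{max}$.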

B. DETAILS OF THE 2DSWML METHOD 

In the case of a sample such as α.40, which is not flux-limited and instead depends on additional observables, we
must consider a bivariate or two-dimensional stepwise maximum likelihood (2DSWML) approach. In this bivariate
case, the likelihood of finding a galaxy with HI mass $M_{HI,i}$ and velocity width $W_{50,i}$ at distance $D_i$ is given by

$$\ell_i = \frac{\phi(M_{HI,i}, W_{50,i})}{\int_{W_{50}=0}^{\infty} \int_{M_{HI}=M_{HI,lim}(D_i,W_{50})}^{\infty} \phi(M_{HI}, W_{50})\, dM_{HI}\, dW_{50}} \qquad (B1)$$

where $M_{HI,lim}(D_i, W_{50})$ is the minimum detectable mass at distance $D_i$ for a galaxy with velocity width $W_{50}$,
calculated using the completeness relationship in integrated flux-velocity width space as described above.

We proceed by splitting the distribution in bins of $m = \log(M_{HI}/M_\odot)$ and $w = \log W_{50}$, and assume a constant value
within each bin. This leads to the Two-Dimensional Step Wise Maximum Likelihood (2DSWML) technique, where
the parameters of the two-dimensional distribution can now be written as $\phi_{jk}$ ($j = 1, 2, ..., N_m$ and $k = 1, 2, ..., N_w$).
The individual likelihood for each galaxy (Eqn. B1) becomes

$$\ell_i = \frac{\sum_j \sum_k V_{ijk}\, \phi_{jk}}{\sum_j \sum_k H_{ijk}\, \phi_{jk}\, \Delta m\, \Delta w}, \qquad (B2)$$



where the set of coefficients $V_{ijk}$ are used to ensure that only the value for the bin to which galaxy $i$ belongs appears
in the numerator and the coefficients $H_{ijk}$ are used to enforce the summation in the denominator to go only over the
area in the $(m, w)$ plane where galaxies could be detectable at distance $D_i$. More precisely,

$$V_{ijk} = \begin{cases} 1 & \text{if galaxy } i \text{ belongs to mass bin } j \text{ and width bin } k \\ 0 & \text{otherwise} \end{cases} \qquad (B3)$$

and, if we denote the completeness function in the $(m, w)$ plane for galaxies at distance $D_i$ by $C_i(m, w)$,

$$H_{ijk} = \frac{1}{\Delta m\, \Delta w} \int_{w_k^-}^{w_k^+} \int_{m_j^-}^{m_j^+} C_i(m, w)\, dm\, dw \qquad (B4)$$






where $m_j^-$ and $m_j^+$ are the HI masses at the lower and upper boundaries of mass bin $j$, respectively, and similarly
$w_k^+$ and $w_k^-$ are the upper and lower boundaries of width bin $k$. The completeness function in the mass-width plane,
$C_i(m, w)$, is directly derived from the α.40 sample data, as in Fig. 1. For the 2DSWML method we restrict ourselves
to galaxies above a strict completeness cut as a function of $W_{50}$, where the completeness is 1, excluding 321 galaxies
(~3% of α.40) from the calculation of the mass function.

The goal of the 2DSWML approach is to find the values of the parameters $\phi_{jk}$ that maximize the joint likelihood
of finding all the galaxies in the sample simultaneously, $\mathcal{L} = \prod_i \ell_i$. In practice it is more convenient to maximize the
log-likelihood, which using Eqn. B2 can be written as

$$\ln \mathcal{L} = \sum_i \ln \ell_i = \sum_i \sum_j \sum_k V_{ijk} \ln(\phi_{jk}\, \Delta m\, \Delta w) - \sum_i \ln\Big( \sum_j \sum_k H_{ijk}\, \phi_{jk}\, \Delta m\, \Delta w \Big) + \mathrm{const.} \qquad (B5)$$

$\ln \mathcal{L}$ is maximized by setting the partial derivatives with respect to each of the parameters equal to zero, giving

$$\phi_{jk} = \sum_i V_{ijk} \left[ \sum_i \frac{H_{ijk}}{\sum_m \sum_n H_{imn}\, \phi_{mn}} \right]^{-1} = n_{jk} \left[ \sum_i \frac{H_{ijk}}{\sum_m \sum_n H_{imn}\, \phi_{mn}} \right]^{-1} \qquad (B6)$$

where $n_{jk}$ is the galaxy count in bin $j, k$. The Maximum Likelihood values for each parameter can be found by
iterating Eqn. B6 until a stable solution is obtained. Finally, the HI mass distribution can be derived from the bivariate
HI mass-velocity width distribution by marginalizing over velocity width, or

$$\phi_j = \sum_k \phi_{jk}\, \Delta w. \qquad (B7)$$

Marginalizing the bivariate distribution over HI mass leads, instead, to the projected velocity width function for HI
bearing galaxies, which will be the focus of a forthcoming publication.

As Eqns. B1 & B2 imply, the overall normalization is lost in the process, and only the relative values of the
parameters $\phi_{jk}$ are meaningful. Fixing the amplitude gives the HI mass function.
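The iteration described in this appendix can be sketched in pure Python as follows. The sketch takes each galaxy's bin indices (which encode $V_{ijk}$) and its completeness coefficients $H_{ijk}$, repeatedly applies the stationarity condition of the log-likelihood, and renormalizes on every pass because only relative values are constrained. The function names, the uniform starting guess, the convergence tolerance, and the toy data are assumptions for illustration, not the survey pipeline.

```python
def swml_2d(bins, H, n_m, n_w, dm, dw, max_iter=500, tol=1e-12):
    """Iteratively solve for the stepwise parameters phi_jk.

    bins[i] = (j, k): mass and width bin of galaxy i (this encodes V_ijk).
    H[i][j][k]: mean completeness of bin (j, k) at the distance of galaxy i.
    Returns phi normalised so that sum(phi_jk) * dm * dw = 1.
    """
    n_gal = len(bins)
    counts = [[0] * n_w for _ in range(n_m)]   # n_jk = sum_i V_ijk
    for j, k in bins:
        counts[j][k] += 1

    flat = 1.0 / (n_m * n_w * dm * dw)         # uniform starting guess
    phi = [[flat] * n_w for _ in range(n_m)]

    for _ in range(max_iter):
        # Per-galaxy sum over all bins: sum_mn H_imn * phi_mn
        denom = [sum(H[i][m][n] * phi[m][n]
                     for m in range(n_m) for n in range(n_w))
                 for i in range(n_gal)]
        new = [[0.0] * n_w for _ in range(n_m)]
        for j in range(n_m):
            for k in range(n_w):
                s = sum(H[i][j][k] / denom[i] for i in range(n_gal))
                new[j][k] = counts[j][k] / s if s > 0.0 else 0.0
        # Only relative values are meaningful, so renormalise every pass.
        norm = sum(sum(row) for row in new) * dm * dw
        new = [[v / norm for v in row] for row in new]
        change = max(abs(a - b) for ra, rb in zip(new, phi)
                     for a, b in zip(ra, rb))
        phi = new
        if change < tol:
            break
    return phi


def marginalise_over_width(phi, dw):
    """Mass distribution: phi_j = sum_k phi_jk * dw."""
    return [sum(row) * dw for row in phi]


# Toy data: two mass bins, one width bin. Three nearby galaxies could be
# detected in either mass bin; one distant galaxy only in the high-mass bin.
bins = [(0, 0), (1, 0), (1, 0), (1, 0)]
H = [[[1.0], [1.0]], [[1.0], [1.0]], [[1.0], [1.0]], [[0.0], [1.0]]]
phi = swml_2d(bins, H, n_m=2, n_w=1, dm=1.0, dw=1.0)
print(marginalise_over_width(phi, dw=1.0))
```

In the toy data the distant galaxy carries no information about the low-mass bin, so the solution is set by the nearby galaxies alone (one low-mass to two high-mass), which is the behaviour that makes the method insensitive to density variations along the line of sight.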

B.1. HIMF Amplitude

To transform the calculated probability density function into an HI mass function (i.e. transform the unitless
$\{\phi_j\, \Delta m\}$ into space densities) we evaluate the amplitude of the HIMF by matching the integral of the distribution
to the inferred average density of galaxies in the survey volume $\bar{n}$, as in Zwaan et al. (2003). Davis & Huchra (1982)
discuss various estimators for $\bar{n}$ that strike different balances between stability against poor knowledge of the selection
function of the survey and immunity to large-scale structure. Since we believe we have a good understanding of the
selection function out to $cz = 15000$ km s$^{-1}$, we choose to adopt the estimator that is least prone to bias, denoted by
$n_1$, defined as

$$n_1 = \frac{1}{V_{survey}} \int \frac{n(D)\, dD}{S(D)} \qquad (B8)$$

where $n(D)\, dD$ is the number of galaxies in a spherical shell of thickness $dD$ and radius $D$, and $V_{survey}$ is the total
survey volume. The selection function $S(D)$ is the fraction of galaxies detectable at distance $D$ and is given by

$$S(D) = \frac{\int_{w_{min}}^{w_{max}} \int_{m_{lim}(D,w)}^{m_{max}} \phi(m, w)\, dm\, dw}{\int_{w_{min}}^{w_{max}} \int_{m_{min}}^{m_{max}} \phi(m, w)\, dm\, dw} \qquad (B9)$$



In the case of the 2DSWML method we evaluate $n_1$ by the expression

$$n_1 = \frac{1}{V_{survey}} \sum_i \frac{1}{\sum_j \sum_k H_{ijk}\, \phi_{jk}\, \Delta m\, \Delta w} \qquad (B10)$$

Eqn. B10 corresponds to weighting each detected galaxy in the survey by the inverse of the selection function at the
galaxy's distance, effectively correcting each detection by the fraction of galaxies that cannot be detected at distance
$D_i$.
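A short Python sketch of this normalization step follows. It assumes that the stepwise parameters have already been normalized so that $\sum_{jk} \phi_{jk}\, \Delta m\, \Delta w = 1$, so that the double sum in the denominator is the selection function at each galaxy's distance. The function names, the toy inputs, and the final scaling of the marginalized distribution by $n_1$ are illustrative assumptions.

```python
def mean_density_n1(H, phi, dm, dw, survey_volume):
    """n_1 = (1 / V_survey) * sum_i 1 / S_i, where
    S_i = sum_jk H_ijk * phi_jk * dm * dw and phi is normalised to unity."""
    total = 0.0
    for H_i in H:
        s_i = sum(h * p for h_row, p_row in zip(H_i, phi)
                  for h, p in zip(h_row, p_row)) * dm * dw
        if s_i <= 0.0:
            raise ValueError("galaxy with zero selection function")
        total += 1.0 / s_i   # each detection stands in for 1 / S_i galaxies
    return total / survey_volume


def scale_to_space_density(phi_j, n1):
    """Convert the normalised mass distribution into space densities
    (assumed scaling: density per unit log mass = n_1 * phi_j)."""
    return [n1 * p for p in phi_j]


# Toy inputs: two mass bins, one width bin, four galaxies, unit bin sizes.
phi = [[1.0 / 3.0], [2.0 / 3.0]]
H = [[[1.0], [1.0]], [[1.0], [1.0]], [[1.0], [1.0]], [[0.0], [1.0]]]
n1 = mean_density_n1(H, phi, dm=1.0, dw=1.0, survey_volume=100.0)
print(n1)
print(scale_to_space_density([row[0] for row in phi], n1))
```

In the toy inputs the three nearby galaxies each count once, while the distant galaxy, for which only two thirds of the distribution is detectable, counts as one and a half galaxies.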






[Plot: horizontal axis $\log W_{50}$, km s$^{-1}$.]

Figure 1. The distribution of sources detectable by ALFALFA, which is dependent on both flux $S_{int}$ in Jy km s$^{-1}$ and profile width $W_{50}$
in km s$^{-1}$.




Figure 2. Distribution of 2,004 sources in the $22^h < \alpha < 03^h$, $24° < \delta < 32°$ portion of the α.40 sample, plotted as R.A. vs. observed
heliocentric recession velocity in km s$^{-1}$.






Figure 3. Top panel: Distribution of 5,960 sources in the $07^h30^m < \alpha < 16^h30^m$, $4° < \delta < 16°$ portion of the α.40 sample, plotted
as R.A. vs. observed heliocentric recession velocity in km s$^{-1}$. Bottom panel: 2,155 sources over the same R.A. range as above, with
$24° < \delta < 28°$.








[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$.]

Figure 5. Histogram of the distribution of HI masses in the sample, plotted as logarithm of the HI mass in solar units.







[Plot: horizontal axis velocity, km s$^{-1}$.]

Figure 6. The average relative weight within the 40% ALFALFA survey volume as a function of observed heliocentric velocity. Where the
relative weight is near 1.0, nearly the entire surveyed volume was accessible for source extraction, and the regions of lower relative weight
correspond to manmade radio frequency interference. These sources are not always present, and do not always result in a complete loss of
signal, so there are regions where the average weight is reduced only modestly. The large dip between 15000 and 16000 km s$^{-1}$ is due to
the FAA radar at the San Juan airport, and because of this extreme loss of volume at large distances we restrict our sample to only those
galaxies within 15000 km s$^{-1}$.



[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$.]

Figure 7. The global HI mass function derived from α.40 via the $1/V_{max}$ method. Points are the HIMF value, per dex, in each mass
bin, with errors as described in the text overplotted. The black dotted line is the Schechter function fit to the points, and the red solid line
is the sum of a Schechter function and a Gaussian fit to the points. The histogram, bottom panel, shows the logarithm of the bin counts.






[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$.]

Figure 8. The residuals between the $1/V_{max}$ HIMF points and the derived best-fit Schechter function (top panel) and the best-fit sum
of a Schechter and a Gaussian (bottom panel). Bars represent the errors on each point, to show the significance of the residual in each
case. The Schechter function provides a poor fit to the spurious 'bump' feature, and this effect is reduced by the addition of a Gaussian
component. The highest-mass bin, which has a large error value, is excluded from this plot.



[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$.]

Figure 9. The contribution to $\Omega_{HI}$ by the galaxies in each bin in α.40. Filled circles have been calculated via the $1/V_{max}$ method, and
open circles are from the 2DSWML method. The total density of neutral hydrogen in the local Universe is dominated by galaxies with
$9.0 < \log(M_{HI}/M_\odot) < 10.0$.






[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$; listed parameters $\phi_* = 0.0048$, $\log(M_*) = 9.96$, $\alpha = -1.33$.]

Figure 10. The global HI mass function derived from α.40 via the 2DSWML method. As in Fig. 7, points are the HIMF value, per
dex, in each mass bin, with errors as described in the text overplotted. The dotted line is the Schechter function fit to the points and the
Schechter function parameters are listed. The histogram, bottom panel, shows the logarithm of the bin counts.



[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$.]

Figure 11. Top panel: The HIMF derived from α.40 with the $1/V_{max}$ method (filled circles) and the 2DSWML method (open circles),
with error bars. Bottom panel: The difference between the HIMF points, shown above, derived from the $1/V_{max}$ and 2DSWML methods.







[Plot: horizontal axis distance (Mpc).]

Figure 12. α.40 detections plotted as $\log(M_{HI}/M_\odot)$ vs. distance in Mpc. The upper (blue) solid line is the HIPASS completeness limit,
and the lower (red) solid line is the HIPASS detection limit. The dashed vertical line shows the redshift limit of HIPASS assuming the
ALFALFA adopted value $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.



[Plot: horizontal axis $\log_{10}(M_{HI}/M_\odot)$; legend: Obreschkow et al. 2009; α.40, $1/V_{max}$; α.40, 2DSWML.]

Figure 13. The HIMF of the Obreschkow et al. (2009) analysis of cool gas in simulated galaxies from the Millennium run (open triangles),
compared to the α.40 $1/V_{max}$ (filled circles) and 2DSWML (open circles) HIMFs. The ALFALFA sample is divided into 5 mass bins per
dex, and the simulated galaxies into 8 bins per dex. Only the mass range $\log(M_{HI}/M_\odot) > 8.0$ is displayed, due to poor mass resolution in
O09, and the simulated galaxy sample includes only galaxies at redshift $z = 0$. For the ALFALFA HIMF, error bars represent both counting
and mass estimate errors, but errors on the O09 HIMF are based on Poisson counting only. Where not visible, error bars are smaller than
the plotted symbol size.



